Senior Software Engineer/Senior Architect

Full-time
USA
Posted 1 year ago
The job listing has expired. Unfortunately, the hiring company is no longer accepting new applications.

We are seeking a talented and experienced Data Engineer to join our team as Technical Lead. The ideal candidate will have experience designing, developing, and maintaining data pipelines that combine data infrastructure with AI training infrastructure to create an end-to-end product.

About the role

  • Architect and design a Python package that helps users create scalable pipelines going from ‘raw data’ to a trained AI model, including data filtering, data cleaning, data visualization, synthetic data creation, and integration into ML training (including reinforcement learning, RL)

  • Work with large datasets to develop both generic models and fine-tuned AI models, especially LLMs, using the package

  • Continually improve the package by incorporating state-of-the-art techniques and frameworks

Experience

  • 10+ years of experience in data engineering or similar roles, with strong knowledge of designing and implementing complex AI and ML solutions as a Senior Data Scientist, Machine Learning Engineer, or AI Engineer

  • Strong proficiency in building large-scale data processing pipelines integrated with AI training, and familiarity with distributed workloads (e.g., multiprocessing, MPI, Ray, Dask, Spark)

  • Experience developing end-to-end pipelines for model training, from handling structured and unstructured data sources, through cleaning and synthetic data creation, to the training itself

  • Experience with AI technologies across the training journey, including intimate familiarity with PyTorch, Horovod, and TensorFlow

  • Ability to take extreme ownership of your work

  • Excellent problem-solving and communication skills

  • Active GitHub contributions are a big plus

  • Must have built data pipelines for ML training (ideally with Ray)
