C the Signs
Position Summary
The Machine Learning Engineer will be responsible for the end-to-end development and deployment of large language models (LLMs) and other machine learning models, with a primary focus on data preprocessing, model training, and fine-tuning using large-scale healthcare datasets. This role requires a strong understanding of LLMs, machine learning principles, and data engineering, as well as experience working with sensitive healthcare data.
Key Responsibilities
Data Preprocessing: Clean, transform, and prepare large, complex healthcare datasets for machine learning model development. This includes handling missing values, outlier detection, feature engineering, and data normalization. Identify, collect, and curate relevant, industry-specific datasets for model retraining. Format data appropriately for the chosen LLM and training pipeline.
Model Training & Fine-Tuning: Design, train, and fine-tune various LLMs on extensive healthcare data to solve specific clinical or operational problems. Set up and manage the training environment, including GPU instances and required software. Train and fine-tune pre-trained LLMs on the custom dataset to achieve specific goals. Experiment with and tune hyperparameters such as learning rate, batch size, and number of training epochs to optimize model performance. Integrate structured and unstructured data (multi-modal/multi-input models).
Model Evaluation & Optimization: Evaluate model performance using appropriate metrics, identify areas for improvement, and implement optimization strategies.
Pipeline Development: Develop and maintain robust and scalable data and ML pipelines for model training, inference, and deployment.
Collaboration: Work closely with data scientists, clinicians, and software engineers to understand requirements, integrate models into production systems, and ensure data privacy and security compliance.
Research & Development: Stay up-to-date with the latest advancements in machine learning and healthcare AI, and explore new technologies and methodologies to enhance our solutions.
Documentation: Maintain clear and comprehensive documentation of models, data pipelines, and experimental results.
Required Qualifications
Education: Bachelor's or Master's degree in Computer Science, Machine Learning, Artificial Intelligence, or a related quantitative field.
Experience:
5+ years of experience in Machine Learning Engineering or a similar role.
Proven experience with large-scale data preprocessing, LLM/model training, and fine-tuning.
Experience with distributed training (PyTorch Distributed, DeepSpeed, Ray, Hugging Face Accelerate).
Experience with GPU/TPU optimization and memory management for large language models.
Experience working with healthcare data is highly desirable.
Technical Skills:
Proficiency in Python and relevant ML libraries (e.g., TensorFlow, PyTorch, Scikit-learn, Pandas, NumPy).
Strong understanding of various machine learning algorithms, large language models, and deep learning architectures.
Experience with cloud platforms (e.g., GCP, AWS) and distributed computing frameworks (e.g., Spark) is a plus.
Familiarity with MLOps practices and tools.
Soft Skills:
Excellent problem-solving and analytical skills.
Strong communication and collaboration abilities.
Ability to work independently and as part of a team in a fast-paced environment.
Work Authorization:
Must be a US citizen, Green Card holder, or currently in the US with a valid H-1B visa.
Why Join Us?
Joining C the Signs is not just about building AI; it’s about shaping the future of healthcare. If you are a technical leader with an unshakable belief in the power of AI to save lives and the ability to make it happen at scale, this is your opportunity to create a tangible, global impact.
Competitive salary and benefits package.
Flexible working arrangements (remote or hybrid options available).
The opportunity to work on life-changing AI technology that directly impacts patient outcomes.
Join a team that combines cutting-edge innovation with a mission to save lives and improve health equity.
Continuous learning opportunities with access to the latest tools and advancements in AI and healthcare.