DSM-H LLC
Typical task breakdown:
· Define scalable and secure architectures, frameworks, and pipelines for building, deploying, and diagnosing production ML applications
· Enable users and teams on the ML platform; troubleshoot and debug user issues; maintain user-friendly documentation and training
· Collaborate with internal stakeholders to build a comprehensive MLOps platform
· Design and implement cloud solutions and build MLOps pipelines on cloud solutions (e.g., AWS)
· Develop standards and examples to accelerate the productivity of data science teams
· Handle code refactoring and optimization, containerization, deployment, versioning, and quality monitoring, including data and concept drift
· Create ways to automate the testing, validation, and deployment of data science models
· Provide best practices and execute POCs for automated and efficient MLOps at scale

Interaction with team:
- Works with the core team, and with additional teams when needed
- Internal-only position
- Works with engineers and scrum teams

Work environment:
Onsite 2-3 days a week, no exceptions.

Education & Experience Required:
- Bachelor's degree with 5+ years of experience
- Master's degree with 3+ years of experience

Required Technical Skills (Required)
· 5+ years of experience working with an object-oriented programming language (Python, Golang, Java, C/C++, etc.)
· Experience with MLOps frameworks like MLflow, Kubeflow, etc.
· Proficiency in programming (Python, R, SQL)
· Ability to design and implement cloud solutions and build MLOps pipelines on cloud solutions (e.g., AWS)
· Strong understanding of DevOps principles and practices (CI/CD, etc.) and tools (Git, GitHub, JFrog Artifactory, Azure DevOps, etc.)
· Experience with containerization technologies like Docker and Kubernetes
· Strong communication and collaboration skills
· Ability to work with a team to create User Stories and Tasks out of higher-level requirements

Nice to Have:
· Ability to create model inference systems with advanced deployment methods that integrate with other MLOps components like MLflow
· Knowledge of inference systems like Seldon, Kubeflow, etc.
· Knowledge of deploying applications and systems in Langfuse or Kubernetes using Helm and Helmfile
· Knowledge of infrastructure orchestration using CloudFormation or Terraform
· Exposure to observability tools (such as Evidently AI)

Soft Skills (Required)
- Someone who takes the initiative on their own
- Someone who does not need to be micromanaged