TalentBridge
Overview
Job Title: DevOps Engineer – ML Ops (AWS, Kubernetes, Terraform)
Location: Denver, CO (Hybrid) • Department: Technology & Innovation • Contract-to-hire
Job Summary: A client in the Technology & Innovation department is seeking a skilled and driven DevOps Engineer to join the ML Ops team. This role is critical in enabling scalable, secure, and efficient machine learning infrastructure across the organization. You will work at the intersection of DevOps and MLOps, supporting data scientists and machine learning engineers in deploying, monitoring, and maintaining ML models in production using modern DevOps best practices. You will leverage tools like Kubernetes, Terraform, and CI/CD pipelines in a cloud-native environment (AWS preferred) to streamline workflows, reduce deployment time, and improve operational efficiency for ML-driven initiatives.
Key Responsibilities
Design, build, and maintain CI/CD pipelines for deploying ML models and supporting infrastructure.
Manage and optimize Kubernetes clusters for containerized ML workloads and services.
Implement Infrastructure-as-Code using Terraform to provision and manage AWS cloud infrastructure.
Collaborate closely with ML engineers, data scientists, and platform teams to ensure reliable deployment and monitoring of ML models.
Automate provisioning, configuration management, and deployment processes to ensure repeatability and scalability.
Monitor infrastructure and applications using observability tools; proactively troubleshoot and resolve system issues.
Ensure security, compliance, and cost optimization across the ML Ops infrastructure.
Contribute to internal tooling and platform improvements that empower ML teams to work efficiently.
Required Qualifications
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent work experience).
3+ years of hands-on experience in DevOps, Site Reliability Engineering, or Cloud Engineering.
Proficient in Kubernetes (EKS preferred) for container orchestration and management.
Strong experience with CI/CD tools (e.g., Jenkins, GitLab CI, CircleCI, ArgoCD).
Deep understanding of Terraform for infrastructure automation and IaC best practices.
Solid experience working in AWS environments (EC2, S3, IAM, Lambda, VPC, CloudFormation, etc.).
Familiarity with monitoring and alerting tools (e.g., Prometheus, Grafana, CloudWatch).
Experience supporting ML pipelines or data science teams is a strong plus.
Scripting skills in Python, Bash, or similar languages.
Strong problem-solving and communication skills; collaborative and team-oriented mindset.
Preferred Qualifications
Experience with ML platforms or MLOps tools (e.g., MLflow, SageMaker, Kubeflow).
Knowledge of GitOps practices and tools like ArgoCD or Flux.
Familiarity with container build tools (Docker, BuildKit).
Experience in highly regulated environments (security and compliance focus).
AWS certifications (e.g., Solutions Architect, DevOps Engineer) are a plus.
Why Join Us
Work on impactful projects at the intersection of AI/ML and infrastructure.
Be part of a growing and innovative ML Ops team.
Access to cutting-edge cloud and ML technologies.
Competitive compensation, excellent benefits, and career development opportunities.
“TalentBridge employees are eligible for many benefit offerings such as medical, dental, vision, life insurance, short-term disability, 401(k), and holiday pay!”