Anduril Industries, Inc.
Senior Site Reliability Engineer
Anduril Industries, Inc., Costa Mesa, California, United States, 92626
Anduril Industries is a defense technology company dedicated to transforming U.S. and allied military capabilities with advanced technology. Our family of systems is powered by Lattice OS, an AI‑powered operating system that turns thousands of data streams into a real‑time, 3D command and control center.
About the Team
Anduril Maritime delivers platforms, systems, and integrated effects in the maritime domain. Our autonomous vehicles (sub‑surface and surface) are the cornerstone of these capabilities, and we continually push the boundaries of endurance, autonomy, and mission capability. The Maritime team develops and maintains core products and payloads, adapting them to serve a wide variety of defense, intelligence, and commercial customers in the U.S. and international markets.
About the Job
As a Senior Site Reliability Engineer on the Maritime Digital Shipbuilding team, you will build and operate the infrastructure that keeps our digital production systems running at full speed. You will develop and manage CI/CD pipelines, automate infrastructure with code, and deploy applications and machine learning models across cloud and edge environments with security, traceability, and reliability in mind.
What You’ll Do
Build and manage CI/CD pipelines using GitHub Actions and JFrog Artifactory to integrate and deploy machine learning models and applications.
Provision and manage infrastructure on cloud platforms (Azure, AWS, GCP) using Terraform and Ansible.
Containerize applications with Docker and orchestrate with Kubernetes for reliable deployment and scaling.
Deploy machine learning models with model registries and feature stores (MLflow, Kubeflow), managing batch and real‑time inference pipelines.
Establish monitoring and logging with ELK Stack, Prometheus, and Grafana for smooth operation of deployment environments.
Collaborate across development, data science, and operations teams to ensure efficient deployment of machine learning models.
Accelerate high‑performance computing tasks using CUDA and OpenCL to process large datasets and simulations.
Required Qualifications
Advanced proficiency in Python for scripting and integration.
Experience with CI/CD tools (GitHub Actions, JFrog Artifactory, Git).
Proficiency with IaC tools (Terraform, Ansible).
Experience with cloud platforms (Azure, AWS, GCP).
Proficiency in containerization (Docker) and orchestration (Kubernetes).
Knowledge of model registries and feature stores (MLflow, Kubeflow).
Experience with logging and monitoring tools (ELK Stack, Prometheus, Grafana).
Understanding of parallel computing frameworks (CUDA, OpenCL).
Strong collaboration skills and proficiency with Jira and Confluence.
Eligibility to obtain and maintain an active U.S. Secret security clearance.
Preferred Qualifications
Previous experience in a manufacturing or industrial setting.
Familiarity with observability concepts and tools.
Knowledge of security best practices for DevOps and MLOps.
Salary and Benefits
US Salary Range: $166,000 - $220,000 USD.
Comprehensive medical, dental, and vision plans at little to no cost.
Life and disability insurance for all employees.
Highly competitive PTO with a holiday hiatus in December; caregiver & wellness leave available.
Family planning support: fertility treatments, adoption, gestational carrier resources.
24/7 free mental health resources, therapy, life coaching, and legal/financial assistance.
Annual professional development reimbursement.
Company‑funded commuter benefits based on region.
Retirement savings plan: 401(k) with employer match (US), pension plan (UK & IE), superannuation (AUS).
Equity grants included in most full‑time offers.
Equal Employment Opportunity
Anduril Industries’s Equal Employment Opportunity policy commits us to not discriminate on the basis of any protected group status under any applicable law.
Contact
Interested in building your career at Anduril Industries? Apply now or submit your resume to careers@andurilindustries.com.