NODA AI

MLOps Engineer

NODA AI, Austin, Texas, US 78716


Overview

Location: Austin, TX (Hybrid on-site, with up to [xx]% travel)

Clearance Requirement: U.S. Citizen with the ability to obtain a security clearance

About NODA

NODA is a veteran-owned, venture-backed technology company that is transforming how unmanned systems collaborate in complex, mission-critical environments. We are developing next-generation solutions that enable the autonomous orchestration of heterogeneous unmanned systems across air, sea, land, and space, with vital applications in the defense, intelligence, and commercial sectors.

Our MLOps Engineers ensure that the AI and machine learning models powering autonomous decision-making are deployed, monitored, and optimized reliably across diverse operational environments. They bridge the gap between AI research and production deployment, enabling continuous improvement of our agentic AI systems in real-world missions. Joining NODA means working on meaningful technology that pushes the boundaries of autonomy alongside a team that thrives on innovation, rapid iteration, and collaboration.

The Role

We are seeking an MLOps Engineer to own the complete lifecycle of the machine learning models that power our autonomous vehicle orchestration platform. You will build the infrastructure, pipelines, and monitoring systems that enable our AI/ML team to deploy, validate, and continuously improve AI agents and reasoning systems operating in mission-critical environments. This role focuses specifically on ML model operations rather than general infrastructure, working closely with our AI/ML and Platform Engineering teams to ensure reliable, scalable, and secure deployment of AI systems across cloud, edge, and tactical computing environments.

Key Responsibilities

- Design and implement automated training pipelines for LLMs, agent frameworks, and reasoning models used in autonomous orchestration
- Build model versioning, experiment tracking, and artifact management systems for AI model development workflows
- Develop automated model validation and testing frameworks, including simulation-based evaluation of agent behaviors
- Implement A/B testing infrastructure for comparing AI reasoning strategies and agent performance in operational scenarios
- Create model monitoring and observability systems to track performance, drift, and reliability of deployed AI systems
- Optimize model deployment for edge computing environments, including quantization, pruning, and inference acceleration
- Build automated retraining pipelines that incorporate operational feedback and performance data from field deployments
- Implement model governance and compliance frameworks, including audit trails, performance documentation, and safety validation
- Design feature stores and data pipelines that prepare operational sensor and mission data for AI model consumption
- Collaborate with AI/ML engineers to establish best practices for model development, testing, and production deployment
- Support rollback and canary deployment strategies for safely updating AI models in mission-critical environments
- Ensure secure model deployment practices, including model encryption, access controls, and adversarial robustness validation

Required Qualifications

- 4+ years of experience in MLOps, ML platform engineering, or production machine learning systems
- Strong proficiency in Python and ML frameworks (PyTorch, TensorFlow, Hugging Face, MLflow, or similar)
- Experience with container orchestration (Docker, Kubernetes), specifically for ML workloads
- Knowledge of model serving frameworks (TorchServe, TensorFlow Serving, Triton, or cloud ML endpoints)
- Understanding of ML experiment tracking, model versioning, and artifact management systems
- Experience with cloud ML platforms (AWS SageMaker, Azure ML, or Google AI Platform)
- Proficiency with data pipeline tools (Apache Airflow, Prefect, or similar workflow orchestration)
- Knowledge of model monitoring, performance tracking, and automated alerting systems
- Understanding of CI/CD practices specifically applied to ML model deployment
- U.S. Citizenship with the ability to obtain a security clearance

Preferred Qualifications

- Experience with large language model deployment, fine-tuning, and optimization techniques
- Background in edge computing and model optimization for resource-constrained environments
- Familiarity with A/B testing frameworks and statistical evaluation of model performance
- Knowledge of feature stores, data versioning, and ML data management systems
- Experience with model security, adversarial testing, and ML system hardening
- Understanding of distributed training and multi-GPU model development workflows
- Background in real-time inference systems and low-latency model serving
- Experience with model explainability tools and responsible AI deployment practices
- Previous work in defense, aerospace, or safety-critical ML applications
- Contributions to open-source MLOps tools or machine learning platforms

What we offer

- Hybrid work environment
- Competitive pay
- Flexible time off
- Generous PTO policy
- Federal holidays
- Generous health, dental, and vision benefits
- Free One Medical membership

Growth Path at NODA

Final leveling will be determined at the offer stage based on scope demonstrated in interviews, prior impact, and calibration to NODA's career ladder.

- Senior MLOps Engineer — Build and harden the training/evaluation pipeline, model registry, and promotion flow; introduce simulation-in-loop gates and runtime SLOs; make rollouts safe and repeatable for field deployments.
- Staff MLOps Engineer — Own the end-to-end ML operations platform (data→features→train→serve→monitor→retrain) across teams; standardize drift detection and rollback playbooks; ship the edge deployment toolchain that eliminates a recurring class of model incidents.
- Principal ML Platform Architect — Define the company-wide model lifecycle vision for multi-domain autonomy: evaluation frameworks, safety gates, lineage/governance, and cost/performance strategy; set cross-team standards and interfaces adopted by the org.
- ML Reliability & Governance Lead — Partner with Product, Field Ops, and Security/Compliance to deliver audit-ready model operations and mission assurance; scale incident learning, safety cases, and readiness.

We are an Equal Opportunity Employer and welcome applicants from all backgrounds. All qualified individuals will receive consideration for employment regardless of race, age, color, religion, sex, national origin, disability, or protected veteran status.
