DigitalOcean

Senior AI/ML Engineer II

DigitalOcean, Austin, Texas, US, 78716


Overview

Dive in and do the best work of your career at DigitalOcean. We are building the next generation of agentic applications on the GradientAI platform, where multi-agent systems of LLM-powered agents collaborate, make decisions, and adapt at scale. You will be part of the team designing robust, scalable, and safe agent workflows that empower developers to build sophisticated AI-driven systems with confidence. If you have a growth mindset, think big and bold, and are energized by a fast-paced environment, you’ll find your place here. We value winning together while learning, having fun, and making a profound difference for the dreamers and builders in the world.

Responsibilities

Architect and deliver production-grade agentic systems: multi-agent orchestration, workflow management, state/memory handling, and runtime governance.
Design and orchestrate modular, LLM-powered agents (e.g., Planner, Tool Executor, QA, Validator) using scalable orchestration patterns (sequential, router, parallel, map-reduce), with clear handoff protocols, shared memory, and structured communication.
Define and enforce guardrails and governance: prompt sanitization, access control, audit trails, threat modeling, and strategies for injection defense, hallucination control, misuse prevention, and compliance.
Establish evaluation and monitoring methods for multi-agent systems covering accuracy, safety, cost, and latency, using observability practices (logs, telemetry, tracing, capturing intermediate outputs) and feedback loops to continuously refine performance.
Build fine-tuning and deployment pipelines: supervised fine-tuning, inference optimization, post-deployment updates, and scaling hardened systems with retries, error handling, and fairness checks.
Rapidly define and deliver MCPs: identify minimal agent roles and orchestration logic, validate quickly, and expand iteratively into robust multi-agent applications.
Integrate seamlessly with the GradientAI platform, ensuring agents leverage DO services (inference, KBs, Functions, storage, networking) for scale, reliability, and cost-efficiency.
Apply strong software engineering practices: testing, CI/CD, code quality, scalable architectures, and distributed system design.
Collaborate cross-functionally with product managers, infra teams, design and UX, and other engineers to ship features that developers adopt and trust.
Participate in and support operational excellence.
Independently ship product features from planning to launch to maintenance with high autonomy.
Collaborate with other engineers to find elegant architectures and solutions.

What You’ll Add To DigitalOcean

Proven experience in software development at scale, with strong foundations in distributed systems, system design, and cloud-native engineering.
Hands-on experience in shipping AI/ML systems into production.
Drive observability, guardrails, and evaluation best practices for multi-agent workflows, ensuring visibility, safety, and continuous improvement.
Partner closely with UX and design teams to ensure agentic features deliver simple, intuitive, and developer-first experiences.
Contribute to platform-level improvements (component libraries, reusable tools, design systems) that accelerate adoption and developer productivity.
Mentor and support teammates in applying guardrails, governance, and orchestration patterns consistently across projects.
Ability to balance engineering trade-offs (reliability, latency, cost) with business outcomes.
5+ years of relevant industry experience in software engineering and deploying agentic AI systems in production within high-growth environments.

Why You’ll Like Working For DigitalOcean

We innovate with purpose. You’ll be part of a cutting-edge technology company with an upward trajectory, aiming to simplify cloud and AI so builders can create software that changes the world. You’ll be a proactive, owner-minded contributor with a bias for action and responsibility for customers, products, and decisions.
We prioritize career development. We offer opportunities to grow, support attendance at conferences and training, and provide access to LinkedIn Learning for continued growth.
We care about well-being. We provide a competitive benefits package and flexible time off where allowed by local regulations and preferences.
We reward our employees. The salary range for this position is $146,600 – $183,300, with potential bonuses and equity compensation, including an Employee Stock Purchase Program.
We value diversity and inclusion. We are an equal-opportunity employer and do not discriminate on the basis of protected characteristics.

This is a remote role.
