DigitalOcean

Principal Engineer, Inference Service

DigitalOcean, Austin, Texas, US, 78716


Overview

Dive in and do the best work of your career at DigitalOcean. We’re seeking an experienced Principal Software Engineer to drive the design, development, and scaling of our Large Language Model (LLM) inference services. This team will build a new product that brings DigitalOcean Simplicity to LLM hosting, serving, and optimization. You will build systems for inference serving of popular open source / open weights LLMs as well as custom models, develop techniques to optimize these models, and scale the platform to millions of users globally.

Base pay range: $206,000.00/yr - $250,000.00/yr. This role supports remote work and is based in a flexible, fast-paced environment.

Responsibilities

- Design and implement an inference platform for serving large language models, optimized for various GPU platforms.
- Lead complex AI and cloud engineering projects through the full product development lifecycle: ideation, product definition, experimentation, prototyping, development, testing, release, and operations.
- Optimize the runtime and infrastructure layers of the inference stack for best model performance.
- Build native cross-platform inference support across NVIDIA and AMD GPUs for multiple model architectures.
- Contribute to open source inference engines to improve performance on the DigitalOcean cloud.
- Develop tooling and observability to monitor system health and enable auto-tuning.
- Build benchmarking frameworks to test model serving performance and guide tuning efforts.
- Mentor engineers on inference systems, GPU infrastructure, and distributed systems best practices.

Qualifications

- 10+ years of software engineering experience, including 2+ years building AI/ML technologies (ideally related to LLM hosting and inference).
- Strong interest in distributed systems, AI/ML, and large-scale cloud implementations.
- Deep expertise in cloud computing platforms and modern AI/ML technologies.
- Experience with modern LLMs, including hosting, serving, and optimization.
- Experience with one or more inference engines (e.g., vLLM, SGLang, Modular Max) is a bonus.
- Experience researching, evaluating, and building with open source technologies.
- Proficiency in Python and Go; familiarity with IaC tools such as Terraform or Ansible is a plus.
- Strong ownership mindset and the ability to drive value for customers.
- Ability to collaborate across engineering, operations, support, and product teams.
- Experience coordinating with partner teams across time zones and geographies.
- Familiarity with end-to-end quality practices.
- Excellent communication skills and the ability to mentor junior engineers.

Why You’ll Like Working for DigitalOcean

- We innovate with purpose and value a growth mindset, bold thinking, and customer responsibility.
- Career development support, with opportunities for conferences, training, and LinkedIn Learning resources.
- Competitive benefits and a flexible time off policy; benefits vary by location.
- Salary and equity compensation, with potential bonuses and an Employee Stock Purchase Program.
- We are an equal-opportunity employer and value diversity and inclusion.
- This is a remote role.

Job Function & Seniority

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Internet Publishing
