NVIDIA

NVIDIA, Santa Clara, California, US, 95053

Senior DGX Cloud Software Engineer - Infrastructure Automation and Distributed Systems

We are seeking Software Engineers with experience building and operating private and public clouds at production scale. Join the DGX Cloud team to support AI training and inference development by building platforms, tools, and services that ensure the operational capacity of our bare-metal, accelerated compute infrastructure and embed reliability best practices into the DGX Cloud ecosystem.

What you’ll be doing:
Design, build, and operate cloud infrastructure services to meet business goals, including integrations, migrations, updates, and decommissions.
Define internal service level objectives and error budgets as part of our observability strategy.
Automate repetitive tasks to improve efficiency where ROI justifies automation.
Engage in blameless incident prevention and response as part of an on-call team.
Advise peer teams on system design best practices.
Contribute to a culture of values-driven communication, introspection, and self-organization.

What we need to see:
Proficiency in Python or Go.
BS in Computer Science or related technical field, or equivalent experience.
5+ years in infrastructure and fleet management engineering.
Experience developing automation tools for large-scale cloud systems in production.
Proven ability to initiate projects, collaborate, and influence others.
Deep knowledge of Linux, Slurm, Kubernetes, storage, and networking.

Ways to stand out:
Strong problem-solving skills, clear communication, ownership, and a results-driven mindset.
Experience with BMaaS, multi-cloud infrastructure, or teaching reliability engineering.
Knowledge of accelerated compute technologies such as BlueField networking, InfiniBand, NVMesh, and NCCL.
Security collaboration experience.
AI/ML experience is a plus but not required.

NVIDIA leads in AI, HPC, and visualization. Our inventions, like the GPU, drive innovation across industries. We seek creative, autonomous individuals to accelerate AI's future.
The salary range is $144,000 - $270,250, based on location, experience, and market rates. Compensation includes equity and benefits. NVIDIA accepts applications continuously and is committed to diversity and equal opportunity in employment.

About NVIDIA

NVIDIA Corporation, based in Santa Clara, California, is a multinational technology company specializing in AI, HPC, and visualization technologies.
