Senior Backend Engineer at DataRobot
DataRobot delivers AI that maximizes impact and minimizes business risk. Our platform and applications integrate into core business processes so teams can develop, deliver, and govern AI at scale. DataRobot empowers practitioners to deliver predictive and generative AI, and enables leaders to secure their AI assets. Organizations worldwide rely on DataRobot for AI that makes sense for their business — today and in the future.
Our AI Compute team is the engine at the heart of DataRobot. We build and operate the foundational computing backbone that powers all of DataRobot's AI products and our customers’ most demanding workloads. Working backwards from the needs of data scientists, ML engineers, and application developers, we provide the raw power and orchestration required to train, deploy, and manage agentic AI at any scale. We are the internal equivalent of a hyperscale cloud provider’s core compute service, obsessed with performance, efficiency, and enabling the future of AI.
Responsibilities
Develop, test, and support features of DataRobot.
Create and maintain automated unit tests and functional tests.
Design infrastructure for new features with input from peers.
Build a system that ensures micro‑services are secure, performant, reliable, and can move from idea to production in hours.
Build a system that continuously provides recommendations to right‑size computing resources for Kubernetes, ensuring efficient cloud spending for ourselves and our customers.
Design and architect automated quality platforms to accelerate release cadence from once‑a‑quarter to once‑a‑day without sacrificing performance, security, or reliability.
Work with Product, Legal and Security to ensure the continuous delivery processes you build are compliant and secure.
Ensure pipelines have clear playbooks and can operate 24/7 without ongoing intervention.
Collaborate with architects and platform engineers across R&D to set continuous delivery and performance requirements for all production services.
Partner with product managers to set roadmaps and deliver innovative solutions to delivery and platform engineering challenges.
Manage individual projects and milestones with abundant communication of progress.
Participate in an on‑call rotation to maintain a resilient, observable platform with minimal intervention.
Knowledge, Skills and Abilities
Expert proficiency in Kubernetes architecture and operations, including resource management, scheduling, autoscaling, Gateway API/Ingress, Prometheus, and OpenTelemetry. Experience with other orchestrators such as Nomad or Slurm is a plus.
Experience with GPU clusters, either as a user or administrator, and experience in multi‑node AI/ML deployments.
Passionate about developing products for internal developers.
Strong computer science fundamentals in object‑oriented design, data structures, algorithms, and complexity analysis.
Understanding of design for scalability, performance, and reliability.
Deep experience with automated testing and test‑driven development.
Demonstrable knowledge of software architecture for large systems.
Real‑world experience decoupling monolithic software into smaller reusable components.
Self‑motivated and proactive, able to take ownership and deliver results.
Willingness to learn new technologies.
Effective communication skills.
Operational excellence: continuously define and improve SLAs, based on customer experience, for all software components this team manages.
Minimum Qualifications / Education and Experience
5+ years of professional software engineering experience.
Expert in developing software with Python (4+ years).
Experience designing and operating diverse CI/CD pipelines with Harness.io.
Experience designing and improving large‑scale, horizontally and vertically scaled build, testing, and deployment systems for Kubernetes environments, and familiarity with Helm charts.
Preferred: Golang, Terraform and Terragrunt.
Preferred: Chronosphere expertise.
Preferred: Multi‑cloud experience (AWS, Azure, GCP, and OpenShift).
Nice to Have
Direct experience with modern distributed compute frameworks (e.g., Ray, Dask) and large‑scale job schedulers (e.g., Slurm, Kueue).
CKAD (Certified Kubernetes Application Developer) certification.
Publicly reviewable contributions to interesting development projects.
Agentic AI experience.
Experience managing NVIDIA infrastructure (NIM Operator, NVIDIA Dynamo Operator).
Benefits
DataRobot’s benefits package may include, depending on location and local legal requirements:
Medical, Dental & Vision Insurance
Flexible Time Off Program
Paid Holidays
Paid Parental Leave
Global Employee Assistance Program (EAP)
And more!
Operating Principles
Wow Our Customers
Set High Standards
Be Better Than Yesterday
Be Rigorous
Assume Positive Intent
Have the Tough Conversations
Be Better Together
Debate, Decide, Commit
Deliver Results
Overcommunicate
Equal Employment Opportunity
DataRobot is proud to be an Equal Employment Opportunity and affirmative action employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, protected veteran status, disability, or other applicable legally protected characteristics. DataRobot is committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. Please see our Equal Employment Opportunity poster and supplement for additional information. All applicant data submitted is handled in accordance with our Applicant Privacy Policy.