AMD

ML Ops Engineer

AMD, Atlanta, Georgia, United States

Base pay range

$143,280.00/yr - $214,920.00/yr

This range is provided by AMD. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more.

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences - the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.

AMD together we advance_

The Role

AMD is seeking a driven and collaborative MLOps Engineer to join our Engineering Operations team in Atlanta. You will support and optimize large-scale, multi-GPU/CPU ML infrastructure to enable world-class AI and rendering research. Collaborating with teams across North America and Europe, you will design robust, automated pipelines and help push the boundaries of machine learning and high-performance compute in a production data center environment.

The Person

You are a hands-on engineer passionate about both machine learning operations and large-scale infrastructure. You excel at collaborating with researchers and IT specialists, drive automation, and enjoy solving complex technical challenges at the intersection of data science and systems engineering.

Key Responsibilities

- Architect, deploy, and maintain high-availability Linux/GPU/CPU server clusters for ML workloads, ensuring optimal performance, security, and scalability.
- Collaborate cross-functionally with data science, research, and IT teams (across North America and Europe) to streamline ML model training, testing, deployment, and monitoring pipelines.
- Build and automate end-to-end CI/CD workflows for ML (using MLflow, DVC, Kubeflow, Airflow, or similar tools).
- Configure, monitor, and optimize large-scale NAS and data transfer for sharing models, datasets, and training results.
- Proactively monitor infrastructure and application health (using Prometheus, Grafana, or similar), addressing performance bottlenecks, failures, and incidents.
- Implement robust security, user management, and access protocols in line with international compliance requirements (GDPR, etc.).
- Document processes, workflows, and troubleshooting guides for global teams; support remote debugging and rapid incident response.
- Stay abreast of trends in AI infrastructure, MLOps toolchains, and AMD hardware accelerators.

Preferred Experience

- Strong programming/scripting background (Python, Bash, or Go) and proven experience with Linux server administration.
- Practical experience managing GPU/CPU clusters and Kubernetes orchestration.
- Experience with infrastructure automation (Ansible, Terraform) and CI/CD pipeline design.
- Familiarity with MLOps stacks (MLflow, DVC, Kubeflow, Flyte, Airflow).
- Experience monitoring and troubleshooting distributed workloads for ML/AI, HPC, or rendering.
- Experience configuring and managing NAS or other distributed file systems for large data.
- Knowledge of networking (TCP/IP, VLANs, firewalls), data privacy, and compliance.
- Strong communication, troubleshooting, and documentation skills.
- Previous exposure to supporting render farms or real-time graphics pipelines is a plus.

Academic Credentials

Computer Science, Computer Engineering, Electrical Engineering, or a closely related field.

Location:

Atlanta, GA Data Center (Onsite)

Benefits offered are described in AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

Seniority level

Mid-Senior level

Employment type

Full-time

Industries

Semiconductor Manufacturing