AMD

AI Performance Architect, Training and Inference

AMD, Santa Clara

Overview

This role is with AMD. Base pay range is provided below; actual pay is based on skills and experience. Talk with your recruiter to learn more.

Base pay range

$226,400.00/yr - $339,600.00/yr

The Role

Join a world-class team enabling software for world-class datacenters and the most powerful supercomputers. AMD is seeking talented and motivated AI Software Engineers to push the boundaries of efficiency and performance, optimizing the software ecosystem for the next generation of GPU accelerators. You will work to combine the newest hardware with the industry’s latest applications, libraries, frameworks, and SDKs to solve complex AI challenges. Minimum 7 years of experience required.

Responsibilities

Enable DL models, libraries, and applications for Instinct GPUs in on-prem and cloud environments.
Analyze and optimize the performance of AI software, identifying hardware bottlenecks to reach near-roofline performance.
Collaborate with a team of Software Engineers to deliver high-performance AI software products.

Key Qualifications

Strong programming skills in C++ and Python.
Experience with at least one major DL framework (e.g., PyTorch or TensorFlow) in inference, fine-tuning, and/or training.
MS with related experience or PhD in Computer Science/Engineering or related field.
Experience developing software and system-level performance optimizations with solid GPU architecture understanding (a plus).
Experience with open-source software development and contributing to communities (a plus).
Publications in reputed peer-reviewed ML conferences/journals (a plus).
Excellent analytical and problem-solving skills for root-causing and addressing performance issues.
Ability to work independently and within a team; self-motivated with a collaborative mindset.
Willingness to learn tools and methods to improve AMD software quality and timeliness.

Preferred Experience

Expertise in profiling tools across the AI software stack (PyTorch Profiler, ROCm Profiler, VTune, Nsight).
Experience implementing and optimizing parallel methods on GPUs (NCCL/RCCL, OpenMP, MPI).
Performance analysis skills for CPU and GPU.
Experience with Singularity, Docker, and/or Kubernetes.
Ability to communicate status and project insights clearly to leadership.

Location

Santa Clara, CA area

Benefits

Benefits offered are described in AMD benefits at a glance. AMD is an equal opportunity, inclusive employer and will consider all applicants without regard to protected characteristics.