Apple Inc.

AIML - Machine Learning Engineer, Foundation Model Services

Apple Inc., Seattle, Washington, US, 98127

Seattle, Washington, United States

Description

Work alongside the Foundation Model Research team to optimize inference for cutting-edge model architectures. Collaborate closely with product teams to develop production-grade solutions for launching models that serve millions of customers in real time. Build tools to identify bottlenecks in inference across different hardware and use cases. Mentor and guide engineers within the organization.

Minimum Qualifications

Demonstrated experience in leading and managing complex, ambiguous projects. Experience with high-throughput services at supercomputing scale. Proficiency in deploying applications on cloud platforms (AWS, Azure, or equivalent) using Kubernetes and Docker. Knowledge of GPU programming with CUDA and familiarity with machine learning frameworks like PyTorch or TensorFlow.

Preferred Qualifications

Experience building and maintaining systems in modern languages (e.g., Go, Python). Understanding of deep learning architectures such as Transformer models and encoder/decoder models. Familiarity with NVIDIA TensorRT-LLM, vLLM, DeepSpeed, and NVIDIA Triton Inference Server. Experience writing custom GPU kernels using CUDA or OpenAI Triton.
