Synergis

Senior Machine Learning Infrastructure Engineer

Synergis, San Mateo, California, United States, 94409

Senior Machine Learning Infrastructure Engineer
Direct Hire
$195K-$295K

About the Team:

The ML Inference Platform is part of the AI Compute Platforms organization within Infrastructure Platforms. Our team is dedicated to building a cloud-agnostic, reliable, and cost-efficient platform that empowers our client's AI ambitions. We are at the forefront of supporting teams that are developing autonomous vehicles (L3/L4/L5) and other AI-driven products. Our platform enables swift innovation and feature development by focusing on ML-centric use cases. It supports state-of-the-art machine learning models for both experimental and bulk inference, emphasizing performance, availability, concurrency, and scalability, and maximizing GPU utilization across diverse platforms (B200, H100, A100, etc.) while ensuring reliability and cost efficiency.

About the Role:

We are looking for a Senior Machine Learning Infrastructure Engineer to contribute to the creation and scaling of robust compute platforms for machine learning workflows. In this impactful role, you will collaborate closely with machine learning engineers and researchers to optimize model serving and inference in production for a variety of workflows including data mining, labeling, model distillation, and simulations. Your expertise will be pivotal in shaping the architecture, roadmap, and user experience of a comprehensive ML inference service that accommodates real-time, batch, and experimental inference needs. The ideal candidate will possess experience in designing distributed systems specifically for ML, with strong problem-solving capabilities and a proactive approach to enhancing platform usability and reliability.

What You'll Be Doing:

- Design and implement core backend components for the platform.
- Work collaboratively with ML engineers and researchers to translate critical workflows into platform requirements and deliver incremental value.
- Lead technical decision-making regarding model serving methods, orchestration, caching, model versioning, and auto-scaling strategies.
- Drive the development of monitoring, observability, and metrics to ensure the reliability, performance, and optimization of inference services.
- Research and integrate cutting-edge model serving frameworks, hardware accelerators, and distributed computing techniques.
- Lead major technical initiatives across the ML ecosystem.
- Elevate the engineering standards through technical leadership and the establishment of best practices.
- Contribute to open-source projects and represent the organization in relevant communities.

Minimum Requirements:

- 8+ years of industry experience focusing on machine learning systems or high-performance backend development.
- Proficiency in Go, Python, C++, or other relevant programming languages.
- In-depth knowledge of ML inference and model serving frameworks (e.g., Triton, Ray Serve, vLLM).
- Strong communication skills and a proven record of driving cross-functional initiatives.
- Experience with cloud platforms such as GCP, Azure, or AWS.
- Ability to excel in a dynamic environment with shifting priorities.

Preferred Qualifications:

- Hands-on experience in building ML infrastructure platforms focused on model serving/inference.
- Experience in developing interfaces, APIs, and clients for ML workflows.
- Familiarity with the Ray framework and/or vLLM.
- Experience with distributed systems and large-scale data processing.
- Knowledge of telemetry and feedback loops to guide product improvements.
- Understanding of hardware acceleration (GPUs) and optimizations for inference workloads.
- Contributions to open-source ML serving frameworks.

The compensation range for this position is $195,000 to $295,000 (dependent on various factors including experience and location). *Note: Disclosure as required by applicable pay transparency laws.

Synergis is proud to be an Equal Opportunity Employer.
We value diversity and do not discriminate on the basis of any status protected by applicable law.

For consideration, please forward your resume to dwicks@synergishr.com. If you require assistance or an accommodation in the application or employment process, please contact us at dwicks@synergishr.com.

Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable state and local laws.

Synergis is a workforce solutions partner serving businesses and job seekers nationwide. Our mission is to build IT ecosystems enabling growth and innovation. Learn more about Synergis at www.synergishr.com.