ByteDance

Tech Lead, AML Inference (San Jose, Regular)

ByteDance, San Jose, California, United States, 95199

Overview

Join us as we work together to inspire creativity and enrich life around the globe.

About the Team

The mission of our Applied Machine Learning (AML) team is to push the next-generation AI infrastructure and recommendation platform for ads ranking, search ranking, live streaming, and e-commerce. We drive substantial impact across ByteDance's core businesses by building world-class ML platforms and systems.

We are seeking a Tech Lead, AML Inference to oversee the development and execution of ByteDance's inference infrastructure. This role will lead and mentor a team of Machine Learning Engineers focused on inference, ensuring reliability, scalability, and performance across large-scale distributed systems. The Inference Lead will collaborate closely with research, product, and platform teams to design and deliver cutting-edge solutions that power critical ranking and recommendation services.

Responsibilities

- Lead and mentor a team of inference-focused Machine Learning Engineers, setting technical direction and ensuring best practices.
- Drive the design and evolution of distributed inference infrastructure to support feeds, ads, search, and other core ranking models.
- Oversee the development of monitoring, observability, and management tools to ensure reliability and scalability of online inference services.
- Identify and resolve system inefficiencies, performance bottlenecks, and reliability issues, ensuring optimized end-to-end performance.
- Partner with research and product teams to translate requirements into robust and efficient inference solutions.
- Stay at the forefront of advancements in inference frameworks, ML hardware acceleration, and distributed systems, incorporating innovations where impactful.

Qualifications

Minimum Qualifications

- Bachelor's degree or above in Computer Science, Electrical Engineering, or a related field.
- 5+ years of experience in developing and deploying large-scale, distributed systems, with at least 5 years in a leadership or technical lead role.
- Strong programming skills in languages such as C++, Python, or Go.
- Deep understanding of inference frameworks and ML system deployment (e.g., TensorFlow, PyTorch, TensorRT, JAX, MXNet).
- Proven experience optimizing performance for large-scale machine learning systems, including hardware-software co-design, GPU/RDMA acceleration, or HPC techniques.
- Excellent communication and collaboration skills; ability to work across research, engineering, and product teams.

Preferred Qualifications

- Experience leading teams working on high-throughput, low-latency ML serving systems.
- Contributions to open-source ML or systems projects.
- Familiarity with container orchestration, service mesh, or cloud-native ML infrastructure.
- Experience collaborating with and leading global, cross-functional teams across different time zones.

Job Information

About Us

Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Lemon8, CapCut and Pico as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.

Why Join ByteDance

Inspiring creativity is at the core of ByteDance's mission. Our innovative products are built to help people authentically express themselves, discover and connect, and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity and enrich life - a mission we work towards every day.

As ByteDancers, we strive to do great things with great people. We lead with curiosity, humility, and a desire to make impact in a rapidly growing tech company. By constantly iterating and fostering an "Always Day 1" mindset, we achieve meaningful breakthroughs for ourselves, our Company, and our users. When we create and grow together, the possibilities are limitless. Join us.

Diversity & Inclusion

ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

Reasonable Accommodation

ByteDance is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at https://tinyurl.com/RA-request