ByteDance

Research Engineer Graduate (Seed-Infra-Platform-US) - 2026 Start (PhD)

ByteDance, Seattle, Washington, US, 98127

Overview

Responsibilities

The Seed-Infra team combines ML system engineering and the art of machine learning to develop and maintain massively distributed ML training and inference systems and services around the world, providing high-performance, highly reliable, scalable systems for LLM/AIGC/AGI.

In our team, you'll have the opportunity to build a large-scale heterogeneous system integrating GPU/NPU/RDMA/storage and keep it running stably and reliably, enrich your expertise in coding, performance analysis, and distributed systems, and be involved in the decision-making process. You'll also be part of a global team with members from the United States, China, and Singapore working collaboratively towards a unified project direction.

The Machine Learning Platform team, a sub-team of Seed Infra, provides end-to-end capabilities across data management, model management, evaluation, serving, and lifecycle management. With a long-term vision in the large model space, the team is dedicated to building world-class infrastructure for large-scale AI platforms.

We are looking for talented individuals to join our team in 2026. As a graduate, you will get opportunities to pursue bold ideas, tackle complex challenges, and unlock limitless growth. Launch your career where inspiration is infinite at ByteDance.

Successful candidates must be able to commit to an onboarding date by the end of 2026. Please state your availability and graduation date clearly in your resume.

The Main Work Directions Include:

- Responsible for the design and development of the architecture of large-scale machine learning systems, solving technical challenges such as high concurrency, high reliability, and high scalability. This covers various sub-directions of machine learning systems, including resource scheduling, model training, model inference, data management, and workflow orchestration.
- Responsible for the research and introduction of advanced technologies in machine learning systems, such as the latest hardware architectures, heterogeneous computing systems, and compiler-based optimization technologies. Working closely with the algorithm teams to jointly optimize algorithms and systems.

Qualifications

Minimum Qualifications:

- Final-year or recent PhD graduate with a background in Computer Science, a related technical field, or equivalent industrial research experience.
- Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment.
- Excellent coding ability, a solid foundation in data structures and basic algorithms, and proficiency in C/C++ or Python; winners of ACM/ICPC, NOI/IOI, and other competitions are preferred.
- Familiarity with at least one mainstream machine learning framework (TensorFlow/PyTorch/JAX).
- Mastery of the principles of distributed systems, with experience participating in the design, development, and maintenance of large-scale distributed systems.
- Strong sense of responsibility, good learning ability, communication ability, and self-motivation.
- Good communication and collaboration skills, able to explore new technologies with the team and promote technological progress.

Preferred Qualifications:

- Prior experience in large-scale projects or papers with great influence in the field of large models.
- Familiarity with NLP- and CV-related algorithms and technologies, and experience in large model training and RL algorithms.
- Experience in one of the following fields: CUDA, RDMA, AI Infrastructure, HW/SW Co-Design, High-Performance Computing (CUTLASS, NCCL), ML Hardware Architecture (GPU, Accelerators, Networking), ML for System, and Distributed Storage.
- Demonstrated related technical experience from previous internships, work experience, coding competitions, or publications.
- Curiosity towards new technologies and entrepreneurship.
- High levels of creativity and quick problem-solving capabilities.

By submitting an application for this role, you accept and agree to our global applicant privacy policy.

Job Information

Compensation details vary by location and are described in the job posting. Benefits may vary depending on the nature of employment and the country work location. Employees have day one access to medical, dental, and vision insurance, a 401(k) savings plan with company match, paid parental leave, disability coverage, life insurance, wellbeing benefits, and paid time off. The Company reserves the right to modify or change these benefits programs at any time, with or without notice.

For Los Angeles County candidates: Qualified applicants with arrest or conviction records will be considered in accordance with applicable laws. Our company conducts a careful assessment of criminal history against the role duties.

About Doubao (Seed)

Founded in 2023, the ByteDance Doubao (Seed) Team is dedicated to pioneering advanced AI foundation models. Our goal is to lead in cutting-edge research and drive technological and societal advancements. Our research areas span deep learning, reinforcement learning, Language, Vision, Audio, AI Infra and AI Safety with labs and research positions across China, Singapore, and the US.

Why Join ByteDance

Inspiring creativity is at the core of ByteDance's mission. Our products help people express themselves and connect. Our diverse teams make that possible. We foster curiosity, humility, and impact, with an "Always Day 1" mindset to achieve meaningful breakthroughs for our company and users. Join us.

Diversity & Inclusion

ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives.
We celebrate diverse voices and strive for an environment that reflects the communities we reach.

Reasonable Accommodation

ByteDance provides reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs, or other protected reasons. For assistance, please reach out to us at the provided accommodation request link.
