Software Engineer, Accelerator Systems & Technologies
Meta Platforms - Menlo Park, California, United States, 94029 17 hours ago
Meta is seeking an experienced software engineer to join our Accelerator Solutions & Te...
Software Engineer, Accelerator Systems & Technologies
META - New York, New York, United States, 10261 2 days ago
Summary: Meta is seeking an experienced software engineer to join our Accelerator Solutions & Technologies group, supporting the development of Meta's...
Machine Learning Engineer
ShyftLabs - Atlanta, Georgia, United States, 30305 21 hours ago
ShyftLabs is seeking an experienced Machine Learning Engineer to join our growing team in Atlanta. You will be resp...
LLM Inference Frameworks and Optimization Engineer
Together AI - San Francisco, California, United States, 94102 21 hours ago
At Together.ai, we are building state-of-the-art infrastructure to enable efficient and scalab...
Senior Software Engineer, AWS Neuron Frameworks
Amazon - Seattle, Washington, United States, 98127 3 days ago
Join the dynamic team behind the AWS Neuron Frameworks, which is the comprehensive software stack for AWS Inferentia and Trainium machine learning acc...
Machine Learning Engineer, Principal - Model Factory
d-Matrix - New York, New York, United States 5 days ago
At d-Matrix, we are focused on unleashing the potential of generative AI to power the transformation of technology. We are at the forefront ...
AI Junior Software Engineer
ICDN TELECOM PTE. LTD. - West Islip, New York, United States, 11795 3 days ago
Responsibilities: As an AI Junior Software Engineer, you will be a key contributor to our innovative projects, specifically focusing on the intersecti...
LLM Training Frameworks and Optimization Engineer
Together AI - San Francisco, California, United States, 94102 21 hours ago
At Together.ai, we are building cutting-edge infrastructure to enable efficient and scalabl...
LLM Training Dataset Optimization Engineer, Mid-Level
Jobright.ai - San Francisco, California, United States, 94199 5 hours ago

Meta Platforms

Software Engineer, Accelerator Systems & Technologies

Meta Platforms - Menlo Park, California, United States, 94029

Overview

Meta is seeking an experienced software engineer to join our Accelerator Solutions & Technologies group, supporting the development of Meta's accelerators collective communications software library and optimizing distributed AI/ML workloads' performance. This is an opportunity to work with a highly skilled engineering team, collaborating with a large set of cross-functional and international partners. Meta's next-generation, super-cluster AI/ML platforms offer the opportunity to work in an extremely dynamic environment, enabling core technologies deployed in some of the world's largest scale clusters.

Responsibilities

Understand and contribute to the collective communications library, intended to be deployed on Meta's AI/ML superclusters
Design and implement communication features for next generation AI/ML workloads
Support networking and compute hardware acceleration techniques to improve ML inference and training model performance
Support large-scale deployment of collective communication libraries across Meta's infrastructure
Perform architectural analysis to ensure system designs meet performance, scalability, and reliability requirements
Analyze simulation results to guide firmware development and optimization efforts

Minimum Qualifications

Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
Masters or PhD in Computer Science, Computer Engineering, or any other relevant technical field
6+ years experience in developing C++ codebase
Understanding of performance, benchmarking measurements, and optimization of collective communication and distributed at-scale model training

Preferred Qualifications

Understanding of the transport stack (e.g., RoCE), its constraints and performance measures and how transport considerations enable the collective communications stack.
Knowledge of AI/HPC hardware requirements and specifications (e.g., configuring hardware components, GPU, memory, network for AI/HPC workloads).
Full-stack experience and understanding of AI/HPC systems, from HW/infrastructure through the application layer, performance optimizations, including familiarity with relevant tools, libraries, and frameworks (e.g., NCCL, PyTorch, CUDA).
Experience in one or more of the following machine learning domains: hardware accelerators, AI Infrastructure, and/or high performance computing (HPC), particularly pertaining to interconnect and collective.

$85.10/hour to $251,000/year + bonus + equity + benefits

Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity or sales incentives, if applicable. In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.

Meta is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics.
