Jobright.ai
Software Engineer, Performance Optimization, Mid-Level
Jobright.ai, Redwood City, California, United States, 94061
Jobright is an AI-powered career platform that helps job seekers discover the top opportunities in the US. We are NOT a staffing agency. Jobright does not hire directly for these positions. We connect you with verified openings from employers you can trust.

Job Summary:
Fireworks AI is an artificial intelligence inference platform for developing and deploying generative AI systems. The company is seeking a Software Engineer focused on Performance Optimization to improve the speed and efficiency of its AI infrastructure by optimizing performance across multiple layers of the stack. The role involves close collaboration with other teams to identify and address performance bottlenecks, directly affecting the performance of advanced generative AI models.

Responsibilities:
• Optimize system and GPU performance for high-throughput AI workloads across training and inference
• Analyze and improve latency, throughput, memory usage, and compute efficiency
• Profile system performance to detect and resolve GPU- and kernel-level bottlenecks
• Implement low-level optimizations using CUDA, Triton, and other performance tooling
• Drive improvements in execution speed and resource utilization for large-scale model workloads (LLMs, VLMs, and video models)
• Collaborate with ML researchers to co-design and tune model architectures for hardware efficiency
• Improve support for mixed precision, quantization, and model graph optimization
• Build and maintain performance benchmarking and monitoring infrastructure
• Scale inference and training systems across multi-GPU, multi-node environments
• Evaluate and integrate optimizations for emerging hardware accelerators and specialized runtimes

Qualifications:

Required:
• Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent practical experience
• 5+ years of experience working on performance optimization or high-performance computing systems
• Proficiency in CUDA or ROCm and experience with GPU profiling tools (e.g., Nsight, nvprof, CUPTI)
• Familiarity with PyTorch and performance-critical model execution
• Experience with distributed system debugging and optimization in multi-GPU environments
• Deep understanding of GPU architecture, parallel programming models, and compute kernels

Preferred:
• Master’s or PhD in Computer Science, Electrical Engineering, or a related field
• Experience optimizing large models for training and inference (LLMs, VLMs, or video models)
• Knowledge of compiler stacks or ML compilers (e.g., torch.compile, Triton, XLA)
• Contributions to open-source ML or HPC infrastructure
• Familiarity with cloud-scale AI infrastructure and orchestration tools (e.g., Kubernetes, Ray)
• Background in ML systems engineering or hardware-aware model design

Company:
Fireworks AI is an artificial intelligence inference platform for developing and deploying generative AI systems. Founded in 2022, the company is headquartered in Redwood City, California, USA, with a team of 51-200 employees. The company is currently at the growth stage. Fireworks AI has a track record of offering H1B sponsorships.

Seniority level: Mid-Senior level
Employment type: Full-time
Industries: Software Development
Benefits (inferred from the description for this job): Medical insurance, Vision insurance, 401(k)