Energy Jobline ZR
ML Infrastructure Engineer in Menlo Park
Energy Jobline ZR, Menlo Park, California, United States, 94029
Energy Jobline is the largest and fastest-growing global Energy Job Board and Energy Hub. We have an audience reach of over 7 million energy professionals, advertise 400,000+ global energy and engineering jobs each month, and work with the leading energy companies worldwide.
We focus on the Oil & Gas, Renewables, Engineering, Power, and Nuclear markets as well as emerging technologies in EV, Battery, and Fusion. We are committed to ensuring that we offer the most exciting career opportunities from around the world for our jobseekers.
Job Description
ML Infrastructure Engineer | Menlo Park, CA | On-Site | Full-Time/Direct Hire
Looking for ML Infra experts (Bay Area) with deep experience in CUDA, GPU optimization, vLLM, and LLM inference (pure LLM focus, no vision/audio).
Client Opportunity | Through Phizenix
Phizenix, a certified minority and women-led recruiting firm, is hiring on behalf of an AI startup pioneering diffusion-based large models—built for faster , multimodal integration, and scalable enterprise deployment.
We're looking for an ML Infrastructure Engineer to help build the infrastructure that powers large-scale model training and real-time inference. You'll collaborate with world-class researchers and engineers to design high-performance, distributed systems that bring advanced LLMs into production.
Responsibilities
Design and manage distributed infrastructure for ML training at scale
Optimize model serving systems for low-latency inference
Build automated pipelines for data processing, model training, and deployment
Implement observability tools to monitor performance in production
Maximize resource utilization across GPU clusters and cloud environments
Translate research requirements into robust, scalable system designs
Must-Haves
Master's or PhD in Computer Science, Engineering, or a related field (or equivalent experience)
Strong foundation in software engineering, systems design, and distributed systems
Experience with cloud platforms (AWS, GCP, or Azure)
Proficient in Python and at least one systems-level language (C++, Rust, or Go)
Hands-on experience with Docker, Kubernetes, and CI/CD workflows
Familiarity with ML frameworks like PyTorch or TensorFlow from a systems perspective
Understanding of GPU programming and high-performance infrastructure
Nice-to-Haves
Experience with large-scale ML training clusters and GPU orchestration
Knowledge of LLM-serving tools (vLLM, TensorRT, ONNX Runtime)
Experience with distributed training strategies (e.g., data/model/pipeline parallelism)
Familiarity with orchestration tools like Kubeflow or Airflow
Background in performance tuning, system profiling, and MLOps best practices
At Phizenix, we're committed to supporting diverse and inclusive teams. This is your chance to shape the systems that power the next of AI innovation. Let's build the future, together.
California Pay Range: $180,000 to $200,000 USD
If you are interested in applying for this job, please press the Apply button and follow the application process. Energy Jobline wishes you the very best of luck in your next career move.