Sciforium

Machine Learning Engineer

Sciforium, San Francisco, California, United States

Sciforium is an AI infrastructure company developing next-generation multimodal AI models and a proprietary, high-efficiency serving platform. Backed by multi-million-dollar funding and direct sponsorship from AMD, with hands‑on support from AMD engineers, the team is scaling rapidly to build the full stack powering frontier AI models and real‑time applications.

About the Role

As a Research Engineer, you’ll work across the full foundation-model stack:

pretraining and scaling,

post-training and reinforcement learning,

sandbox environments for evaluation and agentic learning, and

deployment and inference optimization.

You’ll build and iterate quickly on research ideas, contribute production‑grade infrastructure, and help deliver models that can serve real‑world use cases at scale.

What you’ll work on

This role spans multiple tracks; candidates may focus on one or contribute across several. Examples include:

Pretraining & Scaling

Train large byte-native foundation models across massive, heterogeneous corpora

Design stable training recipes and scaling laws for novel architectures

Improve throughput, memory efficiency, and utilization on large GPU clusters

Build and maintain distributed training infrastructure and fault‑tolerant pipelines

Post-training & RL

Develop post-training pipelines (SFT, preference optimization, RLHF/RLAIF, RL)

Curate and generate targeted datasets to improve specific model capabilities

Build reward models and evaluation frameworks to drive iterative improvement

Explore inference‑time learning and compute techniques to enhance performance

Sandbox Environments & Evaluation

Build scalable sandbox environments for agent evaluation and learning

Create realistic, high‑signal automated evals for reasoning, tool use, and safety

Design offline + online environments that support RL‑style training at scale

Instrument environments for observability, reproducibility, and iteration speed

Deployment & Inference Optimization

Optimize inference throughput/latency for byte‑native architectures

Build high‑performance serving pipelines (KV caching, batching, quantization, etc.)

Improve end‑to‑end model efficiency, cost, and reliability in production

Profile and optimize GPU kernels, runtime bottlenecks, and memory behavior

Ideal candidate credentials

Technical strength

Strong general software engineering skills (writing robust, performant systems)

Experience with training or serving large neural networks (LLMs or similar)

Solid grasp of deep learning fundamentals and modern literature

Comfort working in high‑performance environments (GPU, distributed systems, etc.)

Relevant experience (one or more)

Pretraining / large‑scale distributed training (FSDP/ZeRO/Megatron‑style systems)

Post‑training pipelines (SFT, RLHF/RLAIF, preference optimization, eval loops)

Building RL environments, simulators, or agent frameworks

Inference optimization, model compression, quantization, kernel‑level profiling

Building large ETL pipelines for internet‑scale data ingestion and cleaning

Owning end‑to‑end production ML systems with monitoring and reliability

Research orientation

Ability to propose and evaluate research ideas quickly

Strong experimental hygiene: ablations, metrics, reproducibility, analysis

Bias toward building — you can turn ideas into working code and results

Benefits include

Medical, dental, and vision insurance

401k plan

Daily lunch, snacks, and beverages

Flexible time off

Competitive salary and equity

Equal opportunity

Sciforium is an equal opportunity employer. All applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or veteran or disability status.
