Goliath Partners

Machine Learning Researcher

Goliath Partners, San Francisco, California, United States, 94199


A well-funded AI startup (Series A, backed by top-tier investors and led by ex-DeepMind / Anthropic / OpenAI engineers) is building next-generation agentic systems — intelligent, autonomous software agents that can reason, plan, and act across browsers, operating systems, and enterprise environments.

The Role

As a Research Engineer on the AI Architecture team, you will design, prototype, and rigorously evaluate novel model architectures and training strategies that push the boundaries of efficiency, scaling, and model capability. Your work will directly influence the organization’s next-generation pretraining runs, and you’ll collaborate closely with the pretraining and systems teams to productionize your research.

This role is ideal for someone who thrives in fast-paced research environments, has strong intuition for promising ideas, and enjoys taking concepts from sketch → prototype → thorough experimental validation.

What You’ll Do

Research, design, and test new model architectures and training methods aimed at improving loss-per-FLOP, loss-per-parameter, and overall modeling efficiency

Identify and solve bottlenecks in contemporary architectures

Rapidly prototype and iterate on ideas, running rigorous experiments, ablations, and hypothesis tests

Collaborate closely with pretraining engineers to integrate successful approaches into large-scale training pipelines

Work in a highly collaborative research environment where strong taste, curiosity, and creativity are valued

What We’re Looking For

Strong research intuition and the ability to take a project from concept → experimentation → write-up

Ability to prototype quickly and operate independently in a fast-moving research environment

Curiosity, creativity, and a genuine interest in understanding intelligence

Excellent collaboration skills in high-velocity research teams

Qualifications

Research experience designing or analyzing novel architectures (e.g., state-space models, diffusion models, MoEs, long-context models)

Experience with long-term memory systems, retrieval/RAG, dynamic or adaptive computation, or alternative credit-assignment methods

Background in reinforcement learning, control theory, or signal processing

Demonstrated comfort exploring unconventional or “crazy” ideas and evaluating them rigorously

Understanding of large-scale training pipelines and GPU hardware constraints

Strong experimental methodology (ablations, controls, statistical rigor)

High proficiency in PyTorch and Python

Ability to navigate and contribute to large, complex codebases

Published ML research in reputable venues (NeurIPS, ICML, ICLR, CVPR, etc.)

Postgraduate degree in CS, EE/EECS, Math, Physics, or related scientific field

Total compensation: 500,000-1,000,000 (including equity; base salary 200,000-300,000)
