Amadeus Search
Member of Technical Staff – LLMs
Amadeus Search, San Francisco, California, United States, 94199
Member of Technical Staff – Infrastructure & LLMs
Location:
San Francisco, CA (Hybrid)
Compensation:
$170,000 – $220,000 base + 1–3% equity
Work Authorization:
U.S. work authorization required (no visa sponsorship)
Start Date:
ASAP
Type:
Full-time
About the Role
We’re seeking a deeply curious and technically strong engineer to join a lean, high-performance team building next-generation inference infrastructure for LLMs. This is an opportunity to own the design and development of performance-critical systems from day one, working directly on problems like:
Scaling multi-GPU inference workloads
Designing distributed job schedulers
Experimenting with LLM distillation and optimization frameworks
You’ll join a two-person engineering team at the earliest stage, where your impact will be foundational to both product and culture. No bureaucracy. No politics. Just ambitious, technically challenging work that matters.
Why This Role is Unique
Massive Technical Ownership:
Drive core infra design with zero red tape.
Frontier Engineering:
Work on distributed systems, LLM runtimes, CUDA orchestration, and novel scaling solutions.
Foundational Equity:
Earn meaningful ownership and grow into a founding-level role.
Mission-Driven:
Focused on durable infra, not short-term hype cycles.
No Credentials Needed:
We value ability and drive over resumes and degrees.
Ideal Candidate Profile
2+ years of experience in backend or infrastructure engineering
Deep interest or experience in distributed systems, GPU orchestration, or AI infra
Strong technical curiosity demonstrated through side projects, OSS contributions, or community involvement
Background at infra-focused orgs (e.g., Supabase, Dagster, Modal, Lightning AI, MotherDuck)
Python fluency, with production experience in Docker, GPU workloads, and distributed compute systems
Tech Stack
Core Language:
Python
Infrastructure:
Custom distributed systems for multi-GPU inference
Deployment:
Docker, CUDA, Kubernetes (or equivalent)
Focus:
Batch inference, model distillation, low‑latency pipelines
Soft Traits
Fast learner with ownership mindset
Thinks from first principles, skeptical of default assumptions
Collaborative, positive‑sum team player
Oriented toward building, not credentialism