AI Infrastructure Engineer - PlayerZero

HireOTS, San Francisco, California, United States, 94199

A stealth-stage AI infrastructure company is building a self-healing system for software that automates defect resolution and development. The platform is used by engineering and support teams to:

- Autonomously debug problems in production software
- Fix issues directly in the codebase
- Prevent recurring issues through intelligent root-cause automation

The company is backed by top-tier investors such as Foundation Capital, WndrCo, and Green Bay Ventures, as well as prominent operators including Matei Zaharia, Drew Houston, Dylan Field, Guillermo Rauch, and others.

We believe that as software development accelerates, the burden of maintaining quality and reliability shifts heavily onto engineering and support teams. This challenge creates a rare opportunity to reimagine how software is supported and sustained, with AI-powered systems that respond autonomously.

About the Role

We're looking for an experienced backend/infrastructure engineer who thrives at the intersection of systems and AI, and who loves turning research prototypes into rock-solid production services. You'll design and scale the core backend that powers our AI inference stack, from ingestion pipelines and feature stores to GPU orchestration and vector search.

If you care deeply about performance, correctness, observability, and fast iteration, you'll fit right in.

What You'll Do

- Own mission-critical services end-to-end, from architecture and design reviews to deployment, observability, and service-level objectives.
- Scale LLM-driven systems: build RAG pipelines, vector indexes, and evaluation frameworks handling billions of events per day.
- Design data-heavy backends: streaming ETL, columnar storage, and time-series analytics, all fueling the self-healing loop.
- Optimize for cost and latency across compute types (CPUs, GPUs, serverless); profile hot paths and squeeze out milliseconds.
- Drive reliability: implement automated testing, chaos engineering, and progressive rollout strategies for new models.
- Work cross-functionally with ML researchers, product engineers, and real customers to build infrastructure that actually matters.

You Might Thrive in This Role If You:

- Have 2-5+ years of experience building scalable backend or infra systems in production environments
- Bring a builder mindset: you like owning projects end-to-end and thinking deeply about data, scale, and maintainability
- Have transitioned ML or data-heavy prototypes to production, balancing speed and robustness
- Are comfortable with data engineering workflows: parsing, transforming, indexing, and querying structured or unstructured data
- Have some exposure to search infrastructure or LLM-backed systems (e.g., document retrieval, RAG, semantic search)

Bonus Points

- Experience with vector databases (e.g., pgvector, Pinecone, Weaviate) or inverted-index search (e.g., Elasticsearch, Lucene)
- Hands-on with GPU orchestration (Kubernetes, Ray, KServe) or model-parallel inference tuning
- Familiarity with Go / Rust (primary stack), with some TypeScript for light full-stack tasks
- Deep knowledge of observability tooling (OpenTelemetry, Grafana, Datadog) and profiling distributed systems
- Contributions to open-source ML or systems infrastructure projects
