
AI Infrastructure Engineer - PlayerZero

HireOTS, San Francisco

A stealth-stage AI infrastructure company is building a self-healing system for software that automates defect resolution and development. The platform is used by engineering and support teams to:

  • Autonomously debug problems in production software

  • Fix issues directly in the codebase

  • Prevent recurring issues through intelligent root-cause automation

The company is backed by top-tier investors such as Foundation Capital, WndrCo, and Green Bay Ventures, as well as prominent operators including Matei Zaharia, Drew Houston, Dylan Field, Guillermo Rauch, and others.

We believe that as software development accelerates, the burden of maintaining quality and reliability shifts heavily onto engineering and support teams. This challenge creates a rare opportunity to reimagine how software is supported and sustained, with AI-powered systems that respond autonomously.

About the Role

We’re looking for an experienced backend/infrastructure engineer who thrives at the intersection of systems and AI — and who loves turning research prototypes into rock-solid production services. You’ll design and scale the core backend that powers our AI inference stack — from ingestion pipelines and feature stores to GPU orchestration and vector search.

If you care deeply about performance, correctness, observability, and fast iteration, you’ll fit right in.

What You’ll Do

  • Own mission-critical services end-to-end — from architecture and design reviews to deployment, observability, and service-level objectives.

  • Scale LLM-driven systems: build RAG pipelines, vector indexes, and evaluation frameworks handling billions of events per day.

  • Design data-heavy backends: streaming ETL, columnar storage, time-series analytics — all fueling the self-healing loop.

  • Optimize for cost and latency across compute types (CPUs, GPUs, serverless); profile hot paths and squeeze out milliseconds.

  • Drive reliability: implement automated testing, chaos engineering, and progressive rollout strategies for new models.

  • Work cross-functionally with ML researchers, product engineers, and real customers to build infrastructure that actually matters.

You Might Thrive in This Role If You:

  • Have 2–5+ years of experience building scalable backend or infra systems in production environments

  • Bring a builder mindset — you like owning projects end-to-end and thinking deeply about data, scale, and maintainability

  • Have transitioned ML or data-heavy prototypes to production, balancing speed and robustness

  • Are comfortable with data engineering workflows: parsing, transforming, indexing, and querying structured or unstructured data

  • Have some exposure to search infrastructure or LLM-backed systems (e.g., document retrieval, RAG, semantic search)

Bonus Points

  • Experience with vector databases (e.g., pgvector, Pinecone, Weaviate) or inverted-index search (e.g., Elasticsearch, Lucene)

  • Hands-on with GPU orchestration (Kubernetes, Ray, KServe) or model-parallel inference tuning

  • Familiarity with Go/Rust (the primary stack) and some TypeScript for light full-stack tasks

  • Deep knowledge of observability tooling (OpenTelemetry, Grafana, Datadog) and profiling distributed systems

  • Contributions to open-source ML or systems infrastructure projects
