General Intelligence Company
We’re hiring an Applied AI Engineer to push the boundaries of our Cofounder agent. You’ll own core backend systems and applied LLM work: advancing agent reliability and autonomy, building evaluation pipelines, and shipping techniques that measurably improve agent performance. This is a hands‑on role with high ownership across the research‑to‑production path: prototyping, instrumenting, evaluating, and deploying improvements that show up directly in user outcomes.
What You’ll Do
Design and implement agent improvements end‑to‑end: prompting strategies, tool selection, action planning, memory usage, safety/guardrails, and recovery paths
Build robust evaluation pipelines for the agent: offline evals (golden tasks, regression suites, behavior tests), online metrics (latency, success rate, failure modes, cost efficiency), and experimentation frameworks (A/B tests, canaries, guardrail thresholds); a golden‑task regression check is sketched after this list
Productionize applied LLM techniques: function/tool‑calling orchestration (see the dispatch sketch below), self‑reflection, retrieval/RAG, multi‑agent handoffs, caching/embedding strategies, and hallucination reduction
Improve core backend systems: reliable job orchestration with retries/backoff, idempotency, and auditability (see the retry sketch below); scalable memory and context routing; data pipelines across Gmail, Slack, Notion, Linear, Google Workspace, and other integrations; observability and tracing for agent actions and outcomes
Partner with product and infra to define success metrics and ship fast, safe iterations
Write clean, well‑tested code; document design decisions and runbooks
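
To give a flavor of the evaluation work above, here is a minimal sketch of a golden‑task regression check. It is illustrative only: `run_agent`, `GoldenTask`, and the pass‑rate threshold are placeholders, not our actual harness or API.

```python
# Minimal sketch of a golden-task regression eval. `run_agent` and the
# task format are illustrative placeholders, not our actual API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class GoldenTask:
    name: str
    prompt: str
    check: Callable[[str], bool]  # did the agent's output satisfy the task?

def run_regression_suite(run_agent: Callable[[str], str],
                         tasks: list[GoldenTask]) -> dict:
    """Run every golden task and report the pass rate plus failing tasks."""
    failures = []
    for task in tasks:
        output = run_agent(task.prompt)
        if not task.check(output):
            failures.append(task.name)
    return {"pass_rate": 1 - len(failures) / len(tasks), "failures": failures}

# Example usage: fail the build if a change regresses task success.
tasks = [
    GoldenTask(
        name="schedule_meeting",
        prompt="Schedule a 30-minute call with Dana next Tuesday.",
        check=lambda out: "Tuesday" in out and "30" in out,
    ),
]
# report = run_regression_suite(my_agent, tasks)
# assert report["pass_rate"] >= 0.95, report["failures"]
```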
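
Next, a generic sketch of the tool‑calling dispatch pattern named above: the model proposes a call as structured JSON, and the orchestrator validates and executes it, returning errors the agent can recover from. Tool names and the call format here are illustrative, not a specific provider's API.

```python
# Generic tool-calling dispatch: validate a model-proposed call, execute it,
# and surface actionable errors instead of crashing the agent run.
import json

TOOLS = {
    "search_email": lambda query: f"results for {query!r}",   # stub side effect
    "create_task": lambda title: f"created task {title!r}",   # stub side effect
}

def dispatch(tool_call_json: str) -> str:
    """Execute a tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    name, args = call.get("name"), call.get("arguments", {})
    if name not in TOOLS:
        return f"error: unknown tool {name!r}; available: {sorted(TOOLS)}"
    try:
        return TOOLS[name](**args)
    except TypeError as exc:  # bad or missing arguments from the model
        return f"error: bad arguments for {name!r}: {exc}"

# e.g. dispatch('{"name": "create_task", "arguments": {"title": "Follow up"}}')
```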
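
Finally, a sketch of the retries/backoff and idempotency patterns from the backend bullet, assuming a generic job step; a real system would persist idempotency keys in a durable store (e.g., Redis or Postgres) rather than in memory.

```python
# Retry with exponential backoff plus jitter, and an idempotency-key guard
# so a redelivered job doesn't repeat its side effect. Illustrative only.
import random
import time

def with_retries(fn, max_attempts: int = 5, base_delay: float = 0.5):
    """Retry fn with exponential backoff and full jitter; re-raise on exhaustion."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Sleep a random amount up to base_delay * 2^attempt seconds.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))

_completed: set[str] = set()  # stand-in for a durable idempotency store

def run_once(idempotency_key: str, action):
    """Skip the side effect if this key has already been processed."""
    if idempotency_key in _completed:
        return  # already done; safe to call again after a retried delivery
    action()
    _completed.add(idempotency_key)
```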
What You’ll Bring
4+ years of backend engineering experience, preferably in Python (we care about impact more than years)
Hands‑on LLM experience: prompt engineering, function‑calling, retrieval, embeddings, evaluation design; you’ve shipped LLM features to production
Track record building evaluation harnesses and using them to drive improvements (regression suites, task success metrics, cost/runtime tradeoffs)
Solid distributed systems fundamentals: concurrency, reliability, performance, data modeling, lifecycle management
Pragmatic experimentation: hypothesis → prototype → measured improvement → rollout
Excellent debugging and instrumentation skills; you enjoy finding and fixing edge cases in the wild
Nice To Have
Experience with agent frameworks, tool orchestration, and memory architectures
RAG systems in production (chunking, retrieval quality, freshness strategies); a toy chunk‑and‑retrieve loop is sketched after this list
Redis, Postgres/Supabase, queues (e.g., Celery/Arq/SQS), and event‑driven designs
Observability stacks (Datadog, OpenTelemetry) and cost/latency optimization
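
To make the RAG bullet concrete, here is a toy version of the chunk‑and‑retrieve loop. It is a sketch under stated assumptions: embeddings come from an external model, and brute‑force cosine similarity stands in for a real vector store.

```python
# Toy sketch of the chunk -> embed -> retrieve loop behind a RAG system.
# Embeddings are assumed to come from an external model; production systems
# use a vector store rather than brute-force cosine similarity.
import math

def chunk(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Fixed-size character chunks with overlap so facts aren't split."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec: list[float],
             index: list[tuple[list[float], str]], k: int = 3) -> list[str]:
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]
```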
Why Join Us
Mission: build autonomous agents that run entire businesses
Impact: ship core agent improvements that users feel immediately
Velocity: small, senior team; fast decision cycles; high ownership
Stack: modern tooling across AI orchestration, integrations, and memory systems
Compensation
Competitive salary and meaningful equity
Comprehensive benefits and flexible work setup