Sixtyfour
What you’ll do
Design and ship agentic systems (tool calling, multi-agent workflows, structured outputs) that reliably fetch, extract, and normalize data across the web and APIs.
Own robust web scraping: directory crawling, CAPTCHA handling, headless browsers, rotating proxies, anti-bot evasion, and backoff/retry policies.
Develop backend services in Python + FastAPI with clean contracts and strong observability.
Scale workloads on AWS + Docker (batch/queue workers, autoscaling, fault tolerance, cost control).
Parallelize external API requests safely (rate limits, idempotency, circuit breakers, retries, dedupe); see the sketch after this list.
Integrate third-party APIs for enrichment and search; model and cache responses; manage schema evolution.
Transform and analyze data using Pandas (or similar) for normalization, QA, and reporting.
Pitch in across the stack: billing (Stripe) and occasional front-end changes to ship end-to-end features.
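
To make the API-parallelization responsibility concrete, here is a minimal sketch (the endpoint, concurrency cap, and retry count are placeholders, not Sixtyfour's actual setup) of fanning out external requests with asyncio: a semaphore bounds concurrency, failures back off exponentially, and duplicate URLs are collapsed before fetching.

```python
# Hedged sketch: rate-limit-aware, retried, deduplicated parallel API calls.
# All constants and the example URL are hypothetical.
import asyncio
import httpx

MAX_CONCURRENCY = 10   # assumed provider concurrency limit
MAX_RETRIES = 3

async def fetch(client: httpx.AsyncClient, sem: asyncio.Semaphore, url: str) -> dict:
    async with sem:                              # respect the concurrency cap
        for attempt in range(MAX_RETRIES):
            try:
                resp = await client.get(url, timeout=10.0)
                if resp.status_code == 429:      # rate limited: back off, then retry
                    await asyncio.sleep(2 ** attempt)
                    continue
                resp.raise_for_status()
                return resp.json()
            except httpx.HTTPError:              # network or HTTP error: back off, then retry
                await asyncio.sleep(2 ** attempt)
        return {"url": url, "error": "gave up after retries"}

async def fetch_all(urls: list[str]) -> list[dict]:
    unique = list(dict.fromkeys(urls))           # dedupe while preserving order
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    async with httpx.AsyncClient() as client:
        return await asyncio.gather(*(fetch(client, sem, u) for u in unique))

# asyncio.run(fetch_all(["https://api.example.com/v1/items?page=1"]))
```

Returning an error record instead of raising keeps one bad URL from failing the whole batch, which matters when a queue worker is processing thousands of requests per run.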
Minimum requirements
Hands-on experience with agentic architectures (tool calling, structured outputs/JSON, planning/execution loops) and prompt engineering; see the sketch after this list.
Proven web scraping expertise: solving CAPTCHAs, session/auth flows, proxy rotation, stealth techniques, and legal/ethical constraints.
AWS + Docker in production (at least two of: ECS/EKS, Lambda, SQS/SNS, Batch, Step Functions, CloudWatch).
Building high-throughput data/IO pipelines with concurrency (asyncio/multiprocessing), resilient retries, and rate-limit-aware scheduling.
Integrating diverse external APIs (auth patterns, pagination, webhooks); designing stable interfaces and backfills.
Strong data wrangling with Pandas or equivalent; comfort with large CSV/Parquet workflows and memory/perf tuning.
Excellent ownership, product sense, and pragmatic debugging.
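
As a purely illustrative example of the structured-output side of agentic work, the sketch below validates an agent's JSON output against a Pydantic contract before it enters a pipeline; the schema and field names are hypothetical, not the team's actual models.

```python
# Hedged sketch: enforcing a structured-output contract on agent/tool output.
import json
from pydantic import BaseModel, ValidationError

class CompanyRecord(BaseModel):          # hypothetical extraction schema
    name: str
    website: str
    employee_count: int | None = None

def parse_agent_output(raw: str) -> CompanyRecord | None:
    """Validate the agent's JSON output; reject anything off-contract."""
    try:
        return CompanyRecord.model_validate(json.loads(raw))
    except (json.JSONDecodeError, ValidationError):
        return None                      # caller can re-prompt or route to a repair step

print(parse_agent_output('{"name": "Acme", "website": "https://acme.dev"}'))
```

Rejecting off-contract output at the boundary lets downstream normalization and enrichment stay deterministic instead of absorbing malformed LLM responses.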
Nice to have
Entity resolution/record linkage at scale (probabilistic matching, blocking, deduping).
Experience with Langfuse, OpenTelemetry, or similar for tracing/evals; task queues (Celery/RQ), Redis, Postgres.
Search relevance (BM25/vector/hybrid), embeddings, and retrieval pipelines.
Playwright/Selenium, stealth browsers, anti-bot frameworks, CAPTCHA providers.
CI/CD, infrastructure as code (Terraform), and cost/perf observability.
Security & compliance basics for data handling and PII.