Sixtyfour
Founding Engineer — AI Research Agents (Full-Stack)
California, Missouri, United States, 65018
What you’ll do
- Design and ship agentic systems (tool calling, multi-agent workflows, structured outputs) that reliably fetch, extract, and normalize data across the web and APIs.
- Build and operate search/indexing pipelines on OpenSearch/Elasticsearch (schema design, analyzers, reindex/data-migration strategies, relevance tuning).
- Own robust web scraping: directory crawling, CAPTCHA handling, headless browsers, rotating proxies, anti-bot evasion, and backoff/retry policies.
- Develop backend services in Python + FastAPI with clean contracts and strong observability.
- Scale workloads on AWS + Docker (batch/queue workers, autoscaling, fault tolerance, cost control).
- Parallelize external API requests safely (rate limits, idempotency, circuit breakers, retries, dedupe); a minimal sketch of this pattern follows the list.
- Integrate third-party APIs for enrichment and search; model and cache responses; manage schema evolution.
- Transform and analyze data using Pandas (or similar) for normalization, QA, and reporting.
- Pitch in across the stack: billing (Stripe) and occasional front-end changes to ship end-to-end features.
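A minimal sketch of the rate-limit-aware, concurrent API fetching described above, assuming httpx and plain asyncio; the endpoint, concurrency cap, and retry budget are illustrative assumptions, not a prescribed implementation.

```python
# Illustrative only: caps in-flight requests with a semaphore and retries
# failures with exponential backoff. Endpoint and limits are assumptions.
import asyncio
import httpx

MAX_CONCURRENCY = 10     # assumed cap; tune to each provider's rate limits
MAX_RETRIES = 3
BASE_BACKOFF_S = 0.5

async def fetch_one(client: httpx.AsyncClient, sem: asyncio.Semaphore, url: str) -> dict:
    async with sem:                                  # bound concurrency
        for attempt in range(MAX_RETRIES):
            try:
                resp = await client.get(url, timeout=10.0)
                if resp.status_code == 429:          # rate limited: back off, retry
                    await asyncio.sleep(BASE_BACKOFF_S * 2 ** attempt)
                    continue
                resp.raise_for_status()
                return resp.json()
            except httpx.HTTPError:                  # network/HTTP error: back off, retry
                await asyncio.sleep(BASE_BACKOFF_S * 2 ** attempt)
        return {"url": url, "error": "retries exhausted"}

async def fetch_all(urls: list[str]) -> list[dict]:
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    async with httpx.AsyncClient() as client:
        return await asyncio.gather(*(fetch_one(client, sem, u) for u in urls))

if __name__ == "__main__":
    print(asyncio.run(fetch_all(["https://api.example.com/items/1"])))
```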
Minimum requirements
- Hands-on experience with agentic architectures (tool calling, structured outputs/JSON, planning/execution loops) and prompt engineering.
- Deep knowledge of OpenSearch/Elasticsearch: index design, analyzers, ingestion pipelines, snapshots, rolling upgrades, and zero-downtime reindexing/data migrations (see the alias-swap sketch after this list).
- Proven web scraping expertise: solving CAPTCHAs, session/auth flows, proxy rotation, stealth techniques, and legal/ethical constraints.
- AWS + Docker in production (at least two of: ECS/EKS, Lambda, SQS/SNS, Batch, Step Functions, CloudWatch).
- Building high-throughput data/IO pipelines with concurrency (asyncio/multiprocessing), resilient retries, and rate-limit-aware scheduling.
- Integrating diverse external APIs (auth patterns, pagination, webhooks); designing stable interfaces and backfills.
- Strong data wrangling with Pandas or equivalent; comfort with large CSV/Parquet workflows and memory/perf tuning.
- Familiarity with Stripe (subscriptions, metered billing, webhooks) and basic front-end changes (React/TypeScript or similar).
- Excellent ownership, product sense, and pragmatic debugging.
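One common approach to zero-downtime reindexing is the alias-swap pattern sketched below, written against opensearch-py. The index names, mapping, and host are assumptions for illustration, and exact call signatures can vary across client versions.

```python
# Alias-swap reindex sketch (illustrative): create the new index, copy documents,
# then atomically repoint the read/write alias so clients never see downtime.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])  # assumed local cluster

OLD_INDEX = "companies_v1"
NEW_INDEX = "companies_v2"
ALIAS = "companies"          # the application only ever talks to this alias

# 1. Create the new index with the updated mappings/analyzers.
client.indices.create(index=NEW_INDEX, body={
    "mappings": {"properties": {
        "name": {"type": "text"},
        "domain": {"type": "keyword"},
    }}
})

# 2. Copy documents from the old index into the new one.
client.reindex(
    body={"source": {"index": OLD_INDEX}, "dest": {"index": NEW_INDEX}},
    wait_for_completion=True,
)

# 3. Swap the alias in one atomic call; the old index can be dropped later.
client.indices.update_aliases(body={"actions": [
    {"remove": {"index": OLD_INDEX, "alias": ALIAS}},
    {"add": {"index": NEW_INDEX, "alias": ALIAS}},
]})
```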
Nice to have
- Entity resolution/record linkage at scale (probabilistic matching, blocking, deduping).
- Experience with Langfuse, OpenTelemetry, or similar for tracing/evals; task queues (Celery/RQ), Redis, Postgres.
- Search relevance (BM25/vector/hybrid), embeddings, and retrieval pipelines.
- Playwright/Selenium, stealth browsers, anti-bot frameworks, CAPTCHA providers.
- CI/CD, infrastructure as code (Terraform), and cost/perf observability.
- Security & compliance basics for data handling and PII.