Perplexity AI Inc.

Software Engineer - Data Platform | NYC, Seattle, SF

Perplexity AI Inc., Seattle, Washington, US, 98127

About Perplexity

Perplexity is an AI-powered answer engine founded in December 2022 and growing rapidly as one of the world’s leading AI platforms. Perplexity has raised over $1B in venture investment from some of the world’s most visionary and successful leaders, including Elad Gil, Daniel Gross, Jeff Bezos, Accel, IVP, NEA, NVIDIA, Samsung, and many more. Our objective is to build accurate, trustworthy AI that powers decision‑making for people and assistive AI wherever decisions are being made. Throughout human history, change and innovation have always been driven by curious people. Today, curious people use Perplexity to answer more than 780 million queries every month, a number that’s growing rapidly for one simple reason: everyone can be curious.

About the Role

Perplexity is looking for experienced Data Platform Engineers to design, build, and scale the foundational data systems that power our product, AI research, analytics, and decision‑making at scale.

In this role, you will develop and own critical infrastructure for batch and streaming data processing, data orchestration, reliability, and developer experience across the data stack. You’ll work closely with engineering and data science teams to ensure data is accurate, timely, discoverable, and trustworthy, while enabling teams to move fast without sacrificing correctness or scale.

This is a high‑impact, senior/staff‑level role where you will shape architecture, set standards, and drive long‑term technical direction for Perplexity’s data ecosystem.

What You’ll Do

Data Platform & Pipelines

Design and operate large‑scale batch and streaming data pipelines supporting product features, AI training/evaluation, analytics, and experimentation.

Build and evolve event‑driven and streaming systems (e.g., Kafka/Kinesis/PubSub‑style architectures) for real‑time ingestion, transformation, and delivery.

Own batch processing frameworks for backfills, aggregations, and offline computation.

Orchestration & Reliability

Lead the design and operation of data orchestration systems (e.g., Airflow, Dagster, or equivalent), including scheduling, dependency management, retries, SLAs, and observability.

Establish strong guarantees around data correctness, freshness, lineage, and recoverability.

Design systems that handle scale, partial failure, and evolving schemas.

Platform & Developer Enablement

Build self‑serve data platforms that empower engineers, data scientists, and analysts to safely create and operate pipelines.

Improve developer experience for data work through better abstractions, tooling, documentation, and paved paths.

Set standards for data modeling, testing, validation, and deployment.

Architecture & Leadership

Drive architectural decisions across data infrastructure for storage, compute, orchestration, and APIs.

Partner closely with engineering and data science teams to align data systems with evolving requirements.

Mentor engineers, review designs, and raise the technical bar across the organization.

What We’re Looking For

Minimum Qualifications

5+ years (Senior) or 8+ years (Staff) of software engineering experience.

Strong experience building production data infrastructure systems.

Hands‑on experience with batch and/or streaming data processing at scale.

Deep familiarity with data orchestration systems (Airflow, Dagster, or similar).

Proficiency in Python and at least one additional backend language (Go, TypeScript, etc.).

Strong systems thinking: you understand tradeoffs across reliability, latency, cost, and complexity.

Experience supporting ML/AI workflows, training pipelines, or evaluation systems.

Familiarity with data quality, lineage, observability, and governance tooling.

Prior ownership of internal platforms used by many teams.
