The New York Times
Principal Software Engineer, Data Platform
New York, NY
Mission
The New York Times seeks the truth and helps people understand the world. Journalism is at our core, with a global newsroom and a strategy to make journalism worth paying for.
About the Role
We are seeking a Principal Software Engineer to lead the architecture and evolution of our data and machine learning infrastructure. This role will shape the foundation on which data‑driven products, analytics, and AI applications are built. You will design systems that enable large‑scale data processing, reliable pipelines, and efficient machine learning development—from feature engineering to real‑time model serving.
As a principal engineer, you will partner with product, data science, and platform teams to set technical direction, drive adoption of reusable frameworks, and mentor engineers across the organization. You will ensure that both data and ML platforms are scalable, reliable, cost‑efficient, and compliant with privacy and governance standards.
Key platform technologies include an AWS S3 data lake with Apache Iceberg, Confluent Kafka for real‑time streaming, Fivetran for ETL, Apache Flink for stream processing, AWS Glue (Spark) for ETL, dbt/Athena for analytical models, DynamoDB for low‑latency stores, and Google BigQuery for analytics.
This is a hybrid role based at our New York City headquarters, reporting to the Sr. Director of Engineering. Office attendance: 2+ days per week.
Responsibilities
Design and evolve infrastructure for data ingestion, storage, batch and streaming pipelines, and machine learning workflows.
Build systems for training, deploying, monitoring, and governing models, including feature stores, registries, and inference platforms.
Ensure end‑to‑end system reliability, monitoring, and cost transparency across data and ML workloads.
Deliver frameworks and APIs that enable engineers, analysts, and ML scientists to build and operate solutions independently.
Evaluate and introduce emerging technologies (vector databases, distributed training, orchestration frameworks, LLM stacks) and establish adoption guidelines.
Partner with platform, product, engineering, and ML science leaders to align on strategy and accelerate delivery.
Guide senior and staff engineers, lead architecture reviews, and raise the technical bar across data and ML domains.
Basic Qualifications
10+ years of software engineering experience focused on distributed systems, data platforms, and ML infrastructure.
Proven ability to influence technical direction across multiple teams and mentor senior/staff engineers.
Expertise in data processing frameworks and table formats (Spark, Flink, Iceberg) and orchestration tools (Airflow, Kubeflow).
Deep knowledge of ML infrastructure: model training pipelines, feature stores, registries, serving, and monitoring.
Strong programming skills in Python and at least one compiled language such as Java or Go.
Experience designing systems with scalability, reliability, and cost‑efficiency as first‑class concerns.
Cloud platform experience (AWS, GCP) and familiarity with Kubernetes and modern data platform architectures.
Preferred Qualifications
Experience with compliance and governance in data/ML systems (auditability, privacy, explainability).
Familiarity with the data lakehouse paradigm and medallion architecture.
Salary USD 198,000 – 205,000
We are an Equal Opportunity Employer and do not discriminate on the basis of any protected class. All applications will receive consideration for employment without regard to legally protected characteristics.
The Company will provide reasonable accommodations as required by applicable federal, state, and/or local laws. Individuals seeking an accommodation for the application or interview process should e‑mail reasonable.accommodations@nytimes.com.