Arcus Search
Take the lead in building the data backbone of a high-performance, data-driven organization, where petabyte-scale processing, high-performance computing, and real-time insights power strategic decisions every day. This is more than a data engineering role: you'll scale a next-gen data platform that merges HPC principles, streaming infrastructure, and modern data lakes to support some of the most demanding analytical workloads in the industry.
Drive innovation across real-time streaming and batch pipelines, leveraging cloud-native and hybrid HPC environments
Collaborate across disciplines to solve deep technical challenges, from data lake optimization to distributed compute orchestration
Be part of a culture that prizes autonomy, technical excellence, and business impact
What You’ll Tackle:
Design and implement highly scalable, low-latency data pipelines and storage layers (including columnar formats, object stores, and metadata-driven lakehouses)
Architect data infrastructure across cloud and on-prem HPC clusters, optimized for both structured and unstructured data at scale
Support high-throughput, compute-intensive workflows with Spark, Flink, Trino, Kafka, and Iceberg (see the sketch after this list)
Contribute to best practices for data governance, performance tuning, and multi-tenant data access in complex environments
Mentor engineers and champion technical standards across the data domain
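To give a flavor of the Spark/Kafka/Iceberg stack named above, here is a minimal sketch of a structured-streaming pipeline that reads events from Kafka and appends them to an Iceberg table. Everything concrete in it is an illustrative assumption, not a detail from this posting: the broker address, topic name, event schema, the "lake" catalog, and the storage paths are all placeholders, and it assumes the spark-sql-kafka and iceberg-spark-runtime packages are on the classpath.

```python
# Illustrative sketch only: Kafka -> Spark Structured Streaming -> Iceberg.
# Broker, topic, catalog name, schema, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import (DoubleType, StringType, StructField,
                               StructType, TimestampType)

spark = (
    SparkSession.builder
    .appName("kafka-to-iceberg")
    # Register an Iceberg catalog; "lake" and the warehouse path are placeholders.
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "s3a://example-bucket/warehouse")
    .getOrCreate()
)

# Schema of the hypothetical event payload.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("value", DoubleType()),
    StructField("ts", TimestampType()),
])

# Read the raw Kafka stream and parse the JSON payload into columns.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Continuously append parsed events into a columnar Iceberg table;
# the checkpoint gives the stream restart/exactly-once semantics.
query = (
    events.writeStream
    .format("iceberg")
    .outputMode("append")
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/events")
    .toTable("lake.db.events")
)
query.awaitTermination()
```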
Experience/Skills:
Expertise in data engineering at enterprise or compute-cluster scale, including performance tuning and optimization
Proficiency in Python, Scala, or Java, with strong experience in Spark, Flink, or similar distributed engines
Deep understanding of streaming systems, lakehouse architectures, and cloud/HPC hybrid infrastructure
Hands-on experience with AWS, Azure, or GCP, and tools like Airflow, Iceberg, Kafka, or Trino (a sketch of a small Airflow DAG follows this list)
Ability to thrive in complex environments with high-volume, high-frequency data, and a strong ownership mindset
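As an example of the Airflow experience listed above, the following is a minimal sketch of a daily maintenance DAG that compacts the small files a streaming writer leaves behind in an Iceberg table. The DAG id, schedule, table name, jar path, and job class are hypothetical, and the `schedule` argument assumes Airflow 2.4 or later.

```python
# Illustrative sketch only: a daily Airflow DAG triggering a compaction job.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="iceberg_daily_maintenance",   # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                    # Airflow 2.4+ argument name
    catchup=False,
) as dag:
    # Rewrite small data files into larger ones; the jar, class,
    # and table name are placeholders, not a real deployment.
    compact = BashOperator(
        task_id="rewrite_data_files",
        bash_command=(
            "spark-submit --class org.example.CompactJob "
            "s3a://example-bucket/jobs/compact.jar lake.db.events"
        ),
    )
```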