Purple Drive
Preferred location is Sunnyvale, CA, although candidates from Bentonville, AR will also be considered.
Job Description:
We are seeking a skilled Data Engineer with strong expertise in Big Data, cloud platforms, and distributed data systems. The ideal candidate will have hands-on experience in designing, building, and optimizing data pipelines, API integrations, and real-time stream-processing systems.
Key Responsibilities:
- Design, develop, and optimize large-scale data pipelines and ETL workflows.
- Work with Java, Python, and Scala to build scalable data solutions.
- Develop APIs and integrate with systems using Node.js, GraphQL, and RESTful services.
- Implement big data solutions leveraging Hadoop, Hive, Spark (Scala), Presto/Trino, and Data Lake architectures.
- Deploy and manage workflows with Airflow, Luigi, Automic, and similar orchestration tools.
- Build and maintain real-time data streaming systems using Storm, Spark Streaming, and Kafka (see the illustrative sketch after this list).
- Utilize Vertex AI and cloud services (AWS/GCP/Azure) for advanced analytics and ML integration.
- Ensure system reliability, scalability, and performance in distributed environments.
- Collaborate with cross-functional teams (data scientists, analysts, and engineers) to deliver high-quality data solutions.
- Apply best practices in CI/CD, Kubernetes-based deployments, and monitoring.
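To give candidates a concrete sense of the streaming work described above, here is a minimal, illustrative Spark Structured Streaming sketch in Scala that reads events from a Kafka topic and writes hourly aggregates to a Data Lake path. The broker address, topic name, event schema, and storage paths are hypothetical placeholders rather than details of our actual systems, and the sketch assumes the spark-sql-kafka connector is on the classpath.

```scala
// Minimal sketch (illustrative only): a Spark Structured Streaming job that reads
// from a hypothetical Kafka topic and writes hourly aggregates to a Data Lake path.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object ClickstreamAggregator {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("clickstream-aggregator")
      .getOrCreate()
    import spark.implicits._

    // Assumed event schema; replace with the real data contract.
    val eventSchema = new StructType()
      .add("user_id", StringType)
      .add("url", StringType)
      .add("event_time", TimestampType)

    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // assumed broker address
      .option("subscribe", "clickstream")               // assumed topic name
      .load()
      .select(from_json($"value".cast("string"), eventSchema).as("e"))
      .select("e.*")

    // Hourly page-view counts, with a watermark to bound late-arriving data.
    val hourlyCounts = events
      .withWatermark("event_time", "15 minutes")
      .groupBy(window($"event_time", "1 hour"), $"url")
      .count()

    hourlyCounts.writeStream
      .format("parquet")
      .option("path", "s3a://example-lake/clickstream_hourly/")                 // assumed lake path
      .option("checkpointLocation", "s3a://example-lake/_checkpoints/hourly/")  // assumed checkpoint path
      .outputMode("append")
      .start()
      .awaitTermination()
  }
}
```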
Required Skills:
- Strong programming skills in Java, Python, and Scala.
- Expertise in big data frameworks: Hadoop, Hive, Spark (Scala).
- Hands-on experience with API development (REST, GraphQL, Node.js).
- Experience with stream-processing tools: Kafka, Storm, Spark Streaming.
- Proficiency with workflow orchestration: Airflow, Luigi, Automic.
- Knowledge of Presto/Trino and distributed SQL query engines (see the query sketch after this list).
- Cloud experience (AWS, GCP, or Azure), with exposure to Vertex AI.
- Strong understanding of Data Lake and data warehousing concepts.
- Experience with Kubernetes for container orchestration.