Purple Drive

Data Engineering

Purple Drive, Sunnyvale, California, United States, 94087

Key Responsibilities

- Design, develop, and maintain scalable backend services and APIs using Java, Python, Scala, Node.js, and GraphQL.
- Architect and implement big data solutions leveraging Hadoop, Hive, Spark (Scala), Presto/Trino, and Data Lake concepts.
- Develop and optimize data processing and streaming pipelines using Storm, Spark Streaming, Airflow, Luigi, and Automic (see the sketch following this list).
- Collaborate with data scientists and ML engineers to integrate solutions with Vertex AI and other cloud-based AI/ML platforms.
- Implement containerized solutions with Docker & Kubernetes and manage deployments on cloud environments (AWS/GCP/Azure).
- Ensure system reliability, scalability, and performance through monitoring, testing, and optimization.
- Partner with cross-functional teams, including product managers, data engineers, and DevOps, to deliver high-quality solutions.
- Troubleshoot production issues, optimize system performance, and ensure data consistency and security.
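To give a concrete sense of the streaming-pipeline responsibility above, here is a minimal Spark Structured Streaming sketch. It is written in Python rather than Scala purely for brevity; the Kafka broker address, topic name, and one-minute windowing are illustrative assumptions, not details from this posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, window

# Hypothetical job: count events per minute arriving on a Kafka topic.
# (Requires the spark-sql-kafka connector package on the classpath.)
spark = SparkSession.builder.appName("event_counts").getOrCreate()

# Broker address and topic name are placeholders for illustration.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka delivers the payload as bytes; cast it to a string and
# aggregate counts over 1-minute event-time windows.
counts = (
    events.selectExpr("CAST(value AS STRING) AS payload", "timestamp")
    .groupBy(window(col("timestamp"), "1 minute"))
    .count()
)

# Stream running counts to the console; a production job would write
# to a data lake table or another topic instead.
query = (
    counts.writeStream
    .outputMode("complete")
    .format("console")
    .option("truncate", "false")
    .start()
)
query.awaitTermination()
```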

Required Skills & Qualifications

- Bachelor's/Master's degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
- 7-10 years of experience in backend and data engineering.
- Proficiency in Java, Python, Scala, and Node.js for backend and API development.
- Strong experience with GraphQL (GQL) schema design and implementation.
- Expertise in the Hadoop ecosystem (HDFS, Hive, Spark with Scala).
- Hands-on experience with Presto/Trino and Data Lake architectures.
- Practical knowledge of stream-processing frameworks such as Storm and Spark Streaming.
- Experience with orchestration & workflow tools: Airflow, Luigi, Automic (see the sketch after this list).
- Proficiency with Kubernetes and containerized deployments.
- Strong understanding of cloud services (GCP/AWS/Azure) and Vertex AI.
- Excellent problem-solving, debugging, and optimization skills.
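As an illustration of the orchestration and workflow-tools requirement, the following is a minimal Airflow DAG sketch. It assumes a recent Airflow 2.x release (the schedule parameter replaced schedule_interval in 2.4), and the DAG id, schedule, and task commands are hypothetical placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical daily ETL pipeline: extract -> transform -> load.
with DAG(
    dag_id="daily_events_etl",      # illustrative DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",              # run once per day
    catchup=False,                  # do not backfill past runs
) as dag:
    extract = BashOperator(
        task_id="extract",
        bash_command="echo 'pull raw events'",
    )
    transform = BashOperator(
        task_id="transform",
        bash_command="echo 'run Spark transformation'",
    )
    load = BashOperator(
        task_id="load",
        bash_command="echo 'load into the data lake'",
    )

    # Declare task ordering.
    extract >> transform >> load
```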

Preferred Skills (Nice to Have)

- Experience with machine learning integration and MLOps workflows.
- Knowledge of NoSQL databases (MongoDB, Cassandra, etc.); a brief sketch follows this list.
- Exposure to observability/monitoring tools (Grafana, Prometheus, ELK, etc.).
- Familiarity with Agile/Scrum methodologies.
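For the NoSQL preference, a very small MongoDB sketch using the pymongo driver is shown below; the connection string, database, collection, and field names are assumptions for illustration only.

```python
from pymongo import MongoClient

# Connection string, database, and collection names are illustrative.
client = MongoClient("mongodb://localhost:27017")
events = client["analytics"]["events"]

# Insert a single event document.
events.insert_one({"user_id": "u123", "action": "click", "ts": "2024-01-01T00:00:00Z"})

# Index user_id to speed up per-user lookups.
events.create_index("user_id")

# Fetch one matching document back.
doc = events.find_one({"user_id": "u123"})
print(doc)
```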