CloudTech Innovations
We are seeking a skilled Scala Developer, in a W-2 role, with strong hands‑on experience building batch and streaming data pipelines using Databricks and Apache Spark. The ideal candidate will contribute to the development of cloud‑native, high‑throughput data processing systems that power business intelligence, real‑time analytics, and machine learning workloads. This role requires solid functional programming skills, an understanding of distributed systems, and experience with cloud platforms such as AWS, Azure, or GCP.
Key Responsibilities
Design and develop robust Scala‑based batch and streaming pipelines using Apache Spark on Databricks
Implement and optimize ETL/ELT workflows handling both structured and semi‑structured data
Build real‑time streaming applications leveraging Spark Structured Streaming integrated with Kafka, Kinesis, or Event Hubs (see the streaming sketch after this list)
Use Delta Lake for scalable, ACID‑compliant storage on cloud data lakes (S3, ADLS, GCS)
Collaborate with architects, analysts, and DevOps teams to ensure pipelines are scalable, reliable, and production‑ready
Develop Databricks notebooks, configure job clusters, and manage workflow scheduling and orchestration
Tune Spark performance (shuffle, caching, partitioning) to support large‑scale workloads efficiently
Support data quality checks, error handling, and lineage tracking within production pipelines
Follow DevOps and CI/CD best practices using Git, GitHub Actions/Jenkins, and Terraform
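As a rough illustration of the streaming work described above, here is a minimal Structured Streaming sketch in Scala that reads a Kafka topic, parses the JSON payload, and appends it to a Delta table with checkpointing. The broker address, topic name, schema, and storage paths are illustrative assumptions, not details from this posting.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object EventsIngest {
  def main(args: Array[String]): Unit = {
    // On Databricks a SparkSession is already provided; getOrCreate() also works locally.
    val spark = SparkSession.builder().appName("events-ingest").getOrCreate()

    // Hypothetical payload schema -- adjust to the real event contract.
    val eventSchema = StructType(Seq(
      StructField("id", StringType),
      StructField("ts", TimestampType),
      StructField("amount", DoubleType)
    ))

    // Read a Kafka topic as a streaming DataFrame (broker and topic are placeholders).
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .option("startingOffsets", "latest")
      .load()

    // Kafka values arrive as bytes; parse the JSON payload into typed columns.
    val events = raw
      .select(from_json(col("value").cast("string"), eventSchema).as("e"))
      .select("e.*")
      .withColumn("ingest_date", to_date(col("ts")))

    // Append to a Delta table; the checkpoint lets the job restart without duplicating data.
    events.writeStream
      .format("delta")
      .option("checkpointLocation", "/mnt/datalake/checkpoints/events")
      .partitionBy("ingest_date")
      .outputMode("append")
      .start("/mnt/datalake/tables/events")
      .awaitTermination()
  }
}

The same pattern applies to Kinesis or Event Hubs sources by swapping the source format and connection options; the Delta sink and checkpointing stay the same.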
Required Qualifications
Bachelor’s or Master’s in Computer Science, Data Engineering, or a related field
6+ years of experience in backend or data engineering roles, with a focus on Scala and Apache Spark
Strong hands‑on experience with Databricks, including notebook development and job orchestration
Proven experience building and supporting both batch and streaming pipelines at scale
Proficiency with Delta Lake, Parquet, and performance tuning techniques (see the tuning sketch after this list)
Familiarity with at least one major cloud platform: AWS, Azure, or GCP
Integration experience with real‑time messaging tools: Kafka, Kinesis, or Azure Event Hubs
Exposure to infrastructure‑as‑code tools like Terraform and version control with Git
Strong collaboration skills and experience working in Agile/Scrum environments
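The tuning sketch referenced above: a batch job, assuming hypothetical transaction and merchant tables, that repartitions on the join key to manage shuffle, broadcasts the small dimension table, caches a reused DataFrame, and writes date‑partitioned Delta output. Paths, column names, and the partition count are assumptions for illustration only.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DailyAggregates {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("daily-aggregates").getOrCreate()

    // Hypothetical inputs: a large Parquet fact table and a small Delta dimension table.
    val transactions = spark.read.parquet("/mnt/datalake/raw/transactions")
    val merchants    = spark.read.format("delta").load("/mnt/datalake/tables/merchants")

    // Repartition on the join key to reduce shuffle skew, broadcast the small side,
    // and cache because the enriched DataFrame feeds two downstream aggregations.
    val enriched = transactions
      .repartition(200, col("merchant_id"))
      .join(broadcast(merchants), Seq("merchant_id"))
      .cache()

    val byMerchant = enriched.groupBy("merchant_id").agg(sum("amount").as("total_amount"))
    val byDay      = enriched.groupBy("txn_date").agg(count("*").as("txn_count"))

    // Partition the Delta output by date so downstream reads can prune files.
    byDay.write
      .format("delta")
      .mode("overwrite")
      .partitionBy("txn_date")
      .save("/mnt/datalake/curated/txn_counts_by_day")

    byMerchant.write
      .format("delta")
      .mode("overwrite")
      .save("/mnt/datalake/curated/totals_by_merchant")

    enriched.unpersist()
    spark.stop()
  }
}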
Preferred Skills
Experience with Unity Catalog and enterprise‑grade data governance in Databricks (see the sketch after this list)
Familiarity with SQL, Python, or Bash scripting for supporting cross‑platform workloads
Experience with orchestration tools like Apache Airflow, dbt, or Databricks Workflows
Preferred certifications such as Databricks Certified Developer, AWS Big Data, or Azure Data Engineer
Background in industries such as healthcare, finance, or retail, where data compliance and scale are critical
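For the Unity Catalog item above, a minimal sketch of how a Scala job might address governed tables through the three‑level catalog.schema.table namespace; the catalog, schema, and table names are purely hypothetical.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object GovernedTableExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("governed-table-example").getOrCreate()

    // With Unity Catalog, tables are addressed as catalog.schema.table, and access
    // is controlled by grants on the catalog/schema rather than by storage paths.
    val transactions = spark.table("main.sales.transactions")

    val dailyTotals = transactions
      .groupBy("txn_date")
      .agg(sum("amount").as("total_amount"))

    // Writing back as a managed table keeps the output under the same governance model.
    dailyTotals.write
      .format("delta")
      .mode("overwrite")
      .saveAsTable("main.sales.daily_totals")
  }
}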
Seniority level: Mid‑Senior level
Employment type: Contract
Job function: Engineering and Information Technology
Industries: IT Services and IT Consulting