Senior Java Spark Developer
Diverse Lynx - Austin, Texas, US, 78716
Overview
We are seeking a Senior Java Spark Developer with expertise in Java, Apache Spark, and the Cloudera Hadoop ecosystem to design and develop large-scale data processing applications. The ideal candidate will have strong hands-on experience in Java-based Spark development, distributed computing, and performance optimization for handling big data workloads.
Key Responsibilities:
• Java & Spark Development: Develop, test, and deploy Java-based Apache Spark applications for large-scale data processing. Optimize and fine-tune Spark jobs for performance, scalability, and reliability. Implement Java-based microservices and APIs for data integration.
• Big Data & Cloudera Ecosystem: Work with Cloudera Hadoop components such as HDFS, Hive, Impala, HBase, Kafka, and Sqoop. Design and implement high-performance data storage and retrieval solutions. Troubleshoot and resolve performance bottlenecks in Spark and Cloudera platforms.
• Collaboration & Data Engineering: Collaborate with data scientists, business analysts, and developers to understand data requirements. Implement data integrity, accuracy, and security best practices across all data processing tasks. Work with Kafka, Flume, Oozie, and NiFi for real-time and batch data ingestion.
• Software Development & Deployment: Implement version control (Git) and CI/CD pipelines (Jenkins, GitLab) for Spark applications. Deploy and maintain Spark applications in cloud or on-premises Cloudera environments.
Required Skills & Experience:
• 8+ years of experience in application development, with a strong background in Java and Big Data processing.
• Strong hands-on experience in Java, Apache Spark, and Spark SQL for distributed data processing.
• Proficiency in Cloudera Hadoop (CDH) components such as HDFS, Hive, Impala, HBase, Kafka, and Sqoop.
• Experience building and optimizing ETL pipelines for large-scale data workloads.
• Hands-on experience with SQL & NoSQL databases like HBase, Hive, and PostgreSQL.
• Strong knowledge of data warehousing concepts, dimensional modeling, and data lakes.
• Proven ability to troubleshoot and optimize Spark applications for high performance.
• Familiarity with version control tools (Git, Bitbucket) and CI/CD pipelines (Jenkins, GitLab).
• Exposure to real-time data streaming technologies like Kafka, Flume, Oozie, and NiFi.
• Strong problem-solving skills, attention to detail, and ability to work in a fast-paced environment.
Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualified applicants will receive due consideration for employment without any discrimination. All applicants will be evaluated solely on the basis of their ability, competence and their proven capability to perform the functions outlined in the corresponding role. We promote and support a diverse workforce across all levels in the company.