Cloudera, Inc.

Principal Software Engineer - Apache Spark

Cloudera, Inc., Atlanta, Georgia, United States, 30383


Business Area: Engineering

Seniority Level: Director

Job Description:

At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world's largest enterprises.

Cloudera is looking for a software engineer with strong distributed systems expertise to work on the Cloudera distribution of Apache Spark. We are looking for senior engineers with experience in large-scale, distributed systems and data processing to help build our enterprise-grade system, designed for customers running Spark on thousands of nodes and processing petabytes of data.

We are looking for a passionate individual who is ready to be a team lead for a team that already supports production systems at many of the biggest companies and is looking to expand and take on even more projects to drive the next-generation Data Engineering experience. You will be working with a distributed team spread across the United States and Hungary, including multiple committers on Apache Spark.

As a Principal Software Engineer, you will:

Become a contributor to Cloudera's Data Engineering Experience

Contribute to Apache Spark

Develop new features in Scala/Java/Python on modern platforms

Gain expertise in distributed data processing, from SQL planners and optimizers, to data layout and table formats like Apache Parquet and Iceberg, to fault tolerance in distributed systems.

Gain a solid understanding and deep technical knowledge of components across the Cloudera Data Engineering Experience stack, with a focus on Iceberg and Spark, which you can apply in your daily tasks

Work on large-scale distributed systems, from hundreds to thousands of nodes, in production clusters

Debug system-level deployment issues, perform root cause analysis and system test analysis, and resolve failures

Work on improving internal infrastructure

Collaborate with other team members and stakeholders

We are excited about you if you have:

Bachelor's degree in Computer Science or equivalent, and 10+ years of experience; OR Master's degree and 6+ years of experience; OR PhD and 4+ years of experience

Experience leading and delivering complex product enhancements.

We use Java/Scala/Python/GoLang in our projects; you should have a strong understanding of at least one of the following languages: Java, Scala, GoLang, Rust, C++, Python, and be interested in learning the languages we use.

Experience with systems design and development.

Passionate about programming, clean coding habits, attention to detail, and focus on quality

Strong oral and written communication skills.

Strong ability to research and solve problems independently without constant supervision

(Most importantly) Open-minded, with a desire to learn new things and build great products.

You may also have:

Experience with distributed systems

Experience with SQL planners

Experience using or developing Apache Spark or related technologies.

Experience with large-scale, distributed systems design and development with an understanding of scaling, performance, and scheduling.

Solid experience with at least one cloud service (AWS, Azure, GCP, OpenShift)

Contributions to open-source projects.

What you can expect from us:

Generous PTO Policy

Support for work-life balance with Unplugged Days

Flexible WFH Policy

Mental & Physical Wellness programs

Phone and Internet Reimbursement program

Access to Continued Career Development

Comprehensive Benefits and Competitive Packages

Paid Volunteer Time

Employee Resource Groups

EEO/VEVRAA
