Xlysi LLC.

Databricks Engineer | W2 Only | Remote

Xlysi LLC., Los Angeles, California, United States, 90079


Key Responsibilities

Data Pipeline Development: Design and implement robust ETL/ELT pipelines using Databricks, PySpark, and Delta Lake to process structured and unstructured data efficiently (a brief illustrative sketch follows this list).

Performance Optimization: Tune and optimize Databricks clusters and notebooks for performance, scalability, and cost-efficiency.

Collaboration: Work closely with data scientists, analysts, and business stakeholders to understand data requirements and deliver solutions that meet business needs.

Cloud Integration: Leverage cloud platforms (AWS, Azure, GCP) to build and deploy data solutions, ensuring seamless integration with existing infrastructure.

Data Modeling: Develop and maintain data models that support analytics and machine learning workflows.

Automation & Monitoring: Implement automated testing, monitoring, and alerting mechanisms to ensure data pipeline reliability and data quality.

Documentation & Best Practices: Maintain comprehensive documentation of data workflows and adhere to best practices in coding, version control, and data governance.

Required Qualifications

Experience: 5+ years in data engineering, with hands-on experience using Databricks and Apache Spark.

Programming Skills: Proficiency in Python and SQL; experience with Scala is a plus.

Cloud Platforms: Strong experience with cloud services such as AWS (e.g., S3, Glue, Redshift), Azure (e.g., Data Factory, Synapse), or GCP.

Data Engineering Tools: Familiarity with tools like Airflow, Kafka, and dbt.

Data Modeling: Experience in designing data models for analytics and machine learning applications.

Collaboration: Proven ability to work in cross-functional teams and communicate effectively with non-technical stakeholders.

Primary Skill Set

Databricks, Apache Spark, Python, SQL, Scala (optional), ETL/ELT development, Delta Lake.

Cloud platforms (AWS, Azure, GCP), Data modeling.

Cross-functional collaboration, Communication.

Secondary Skill Set

Airflow, dbt, Kafka, Hadoop, MLflow, Unity Catalog, Delta Live Tables.

Cluster optimization, Data governance, Security and compliance.

Databricks certifications.
