Codvo.ai
Job Description:
Data Engineer - Databricks (US Citizens or Green Card holders only)
Role Overview
We are looking for a highly skilled Full Stack Data Engineer with expertise in Databricks to design, develop, and optimize end-to-end data pipelines, data platforms, and analytics solutions. The role combines strong data engineering, cloud platform, and software engineering skills to deliver scalable, production-grade solutions.
Key Responsibilities
- Design and develop ETL/ELT pipelines on Databricks (PySpark, Delta Lake, SQL); an illustrative sketch follows this list.
- Architect data models (batch and streaming) for analytics, ML, and reporting.
- Optimize the performance of large-scale distributed data processing jobs.
- Implement CI/CD pipelines for Databricks workflows using GitHub Actions, Azure DevOps, or similar.
- Build and maintain APIs, dashboards, or applications that consume processed data (the full-stack aspect of the role).
- Collaborate with data scientists, analysts, and business stakeholders to deliver solutions.
- Ensure data quality, lineage, governance, and security compliance.
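To make the pipeline work above concrete, here is a minimal, illustrative PySpark batch step that cleans raw data and writes it back as a Delta table. It is a sketch of the kind of task the role involves, not production code; the table and column names (raw_events, event_id, event_ts, events_clean) are hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a SparkSession already exists; getOrCreate() reuses it.
spark = SparkSession.builder.getOrCreate()

# Extract: read a raw Delta table (hypothetical name).
raw = spark.read.table("raw_events")

# Transform: deduplicate and keep only rows with a valid timestamp.
clean = (
    raw.dropDuplicates(["event_id"])
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .filter(F.col("event_ts").isNotNull())
)

# Load: overwrite a managed Delta table for downstream analytics.
clean.write.format("delta").mode("overwrite").saveAsTable("events_clean")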
Required Skills & Qualifications

Core Databricks Skills:
- Strong PySpark, Delta Lake, and Databricks SQL.
- Experience with Databricks Workflows, Unity Catalog, and Delta Live Tables.

Programming & Full Stack:
- Python (mandatory); expert-level SQL.
- Exposure to Java/Scala (for Spark jobs).
- Knowledge of APIs, microservices (FastAPI/Flask), or basic front-end (React/Angular) is a plus; an illustrative API sketch follows this section.

Cloud Platforms:
- Proficiency with at least one of Azure Databricks, AWS Databricks, or GCP Databricks.
- Knowledge of cloud storage (ADLS, S3, GCS), IAM, and networking.

DevOps & CI/CD:
- Git and CI/CD tools (GitHub Actions, Azure DevOps, Jenkins).
- Containerization (Docker; Kubernetes is a plus).

Data Engineering Foundations:
- Data modeling (OLTP/OLAP).
- Batch and streaming data processing (Kafka, Event Hub, Kinesis).
- Data governance and compliance (Unity Catalog, Lakehouse security).
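As a sketch of the full-stack side, here is a minimal FastAPI service exposing processed data over an API. The endpoint path and the fetch_daily_metrics helper are hypothetical; a real service would query the lakehouse (for example via Databricks SQL) rather than return a stub.

from fastapi import FastAPI

app = FastAPI()

# Hypothetical stand-in: a real implementation would query the
# processed tables (e.g. through a Databricks SQL connection).
def fetch_daily_metrics(day: str) -> dict:
    return {"date": day, "events": 0}

@app.get("/metrics/{day}")
def get_metrics(day: str) -> dict:
    # Return aggregated metrics for one day of processed data.
    return fetch_daily_metrics(day)

Run locally with, for example, uvicorn app:app --reload (assuming the file is saved as app.py).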
Nice-to-Have
- Experience with machine learning pipelines (MLflow, Feature Store); a minimal MLflow sketch follows this list.
- Knowledge of data visualization tools (Power BI, Tableau, Looker).
- Exposure to graph databases (Neo4j) or RAG/LLM pipelines.
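For the MLflow item above, experiment tracking at its simplest looks like the sketch below; the run name, parameter, and metric values are placeholders, not a real model. On Databricks, runs logged this way appear in the workspace's built-in MLflow tracking UI.

import mlflow

# Log one training run; every name and value here is a placeholder.
with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("model_type", "gradient_boosting")
    mlflow.log_metric("rmse", 0.42)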
Qualifications
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
- 4-7 years of experience in data engineering, with at least 2 years on Databricks.
Soft Skills
- Strong problem-solving and analytical skills.
- Ability to work in fusion teams (business + engineering + AI/ML).
- Clear communication and documentation abilities.
About Us
At Codvo, we are committed to building scalable, future-ready data platforms that power business impact. We believe in a culture of innovation, collaboration, and growth, where engineers can experiment, learn, and thrive. Join us to be part of a team that solves complex data challenges with creativity and cutting-edge technology.