Apolis
Role: Databricks Architect
End Client: Epson | Client Location: Boston/NJ | 100% remote
Primary Skill: Data Modeler with Pharma Commercial background; experience with AWS Glue and Python
Description
We are seeking a highly skilled Senior AWS Engineer / Architect with 13-14 years of experience in data engineering and architecture within Pharma Commercial. This role is ideal for an expert with deep hands-on experience in cloud platforms (AWS, Azure, or GCP), AWS Glue, Python, Databricks, and Apache Spark, along with a strong foundation in designing and implementing scalable, secure, and high-performing data solutions. You will drive the design and implementation of modern data architectures, including data lakes, data warehouses, and streaming systems, while working closely with cross-functional teams to ensure alignment with business goals. This is a strategic technical role with a strong focus on execution and continuous improvement.
Responsibilities
• Design and model scalable, secure, and high-performing data architectures using AWS Glue, Python, Databricks, and Apache Spark.
• Define and implement data lake, data warehouse, and real-time streaming solutions; develop architecture blueprints and documentation to support technical and business stakeholders.
• Configure, monitor, and optimize Databricks clusters for performance and cost efficiency; apply best practices in cluster sizing, autoscaling, and performance tuning.
• Lead the design and development of complex ETL/ELT pipelines for batch and real-time data ingestion using Databricks; implement modular, reusable components for scalable data processing workflows.
• Integrate Databricks with cloud-native storage solutions (e.g., AWS S3, Azure Data Lake, GCS) and other data services; support hybrid and multi-cloud deployments where required.
• Ensure compliance with security standards, IAM policies, encryption, and regulatory requirements.
• Mentor junior engineers and guide them on best practices in Databricks and big data engineering.
• Evaluate and recommend new tools, frameworks, and best practices in the Databricks ecosystem.
• Stay up to date with the latest in Delta Lake, Databricks SQL, MLflow, and open-source Spark enhancements.
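For context only (not part of the client's stated requirements): a minimal sketch, assuming a Databricks/PySpark environment, of the kind of batch ETL pipeline the responsibilities above describe. All bucket paths, table names, and column names here are hypothetical.

```python
# Illustrative batch ETL sketch: raw cloud storage -> curated Delta table.
# Paths, columns, and table names are placeholders, not the client's schema.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("commercial-sales-ingest").getOrCreate()

# Read raw commercial sales data from cloud storage (hypothetical S3 path).
raw = (
    spark.read
    .option("header", "true")
    .csv("s3://example-bucket/raw/sales/")
)

# Light transformation: cast types and derive a net amount column.
curated = (
    raw.withColumn("sale_date", F.to_date("sale_date"))
       .withColumn(
           "net_amount",
           F.col("gross_amount").cast("double") - F.col("discount").cast("double"),
       )
)

# Persist as a Delta table for downstream analytics (Delta is native on Databricks).
curated.write.format("delta").mode("overwrite").saveAsTable("commercial.sales_curated")
```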
About You (Desired Profile)
• 13+ years of experience in data engineering, big data, or cloud data architecture roles, with strong expertise in AWS, Databricks, Apache Spark, and Delta Lake.
• Proven experience designing and delivering large-scale data lakes, warehouses, and streaming platforms.
• In-depth experience with cloud platforms (AWS, Azure, or GCP), including native services and integration patterns.
• Strong coding skills in Python, Scala, or SQL.
• Solid understanding of data governance, security, IAM, and regulatory compliance in cloud environments.
• Experience with tools like Airflow, dbt, or CI/CD pipelines for data.
• Excellent problem-solving, documentation, and communication skills.
• Bachelor's degree in Computer Science, Data Engineering, or a related field (Master's preferred).
• Experience working in the pharmaceutical domain is a plus.
• Certifications in Databricks, AWS, or Azure (or actively working toward them).
• Exposure to Unity Catalog, Delta Lake, or Databricks SQL.
• Knowledge of data governance, IAM, or cloud security practices.
Mandatory Qualifications
• Certifications in AWS, Azure, or GCP.
• Certifications in Data