Healthcare Big Data Engineer
Molina Healthcare - Long Beach, California, us, 90899
Work at Molina Healthcare
Overview
- View job
Overview
We are looking for a talented and innovative Big Data Engineer to enhance our healthcare data team. This position is vital for the end-to-end design, development, and management of large-scale data systems specifically aimed at healthcare analytics. As an ideal candidate, you will architect and maintain robust, scalable, and secure data pipelines that drive critical decision-making throughout our organization. Responsibilities: Design and implement scalable, high-performance Big Data solutions for both structured and unstructured data from various sources. Build and manage batch and real-time data ingestion/extraction pipelines with tools like Kafka, Spark Streaming, and Talend. Create reusable and efficient ETL frameworks utilizing Python or Scala for high-volume data transformation and movement. Optimize data models to support diverse analytical and operational use cases, including healthcare claims and utilization data. Collaborate with data scientists, analysts, and business partners to translate requirements into effective data products. Deploy, monitor, and troubleshoot Hadoop-based infrastructure using tools such as Cloudera Manager, Ambari, and Zookeeper. Enforce data quality, security, and compliance standards through tools like Kerberos, Ranger, and Sentry. Implement web services and APIs (REST/SOAP) for seamless integration with applications and visualization platforms. Contribute to data governance initiatives, focusing on metadata management, lineage tracking, and quality assurance. Qualifications: Required: Minimum of 3 years in Big Data engineering, data integration, and pipeline development. Proficiency in Python, Java, or Scala for data transformation and system scripting. Expertise in Big Data tools such as Spark, Hive, Impala, Presto, Phoenix, Kylin, and Hadoop (HDFS, YARN). Experience in real-time stream-processing systems using Kafka, Storm, or Spark Streaming. Strong understanding of NoSQL databases like HBase and MemSQL, as well as traditional RDBMS (PostgreSQL, Oracle, SQL Server). Skilled in ETL design using tools like Talend or Informatica. Demonstrated experience in deploying and monitoring Big Data infrastructure using Ambari, Cloudera Manager, and Zookeeper. Thorough understanding of data warehousing, data validation, quality checks, metadata management, and governance. Preferred: 5+ years of progressive experience in Big Data engineering or analytics. Experience in the healthcare industry with knowledge of clinical, claims, or care management data. Familiarity with cloud platforms (AWS, Azure) and containerization tools (Docker, Kubernetes). Technical Environment: Big Data Ecosystem: Hadoop, Spark, Hive, Kafka, Presto, Impala, Phoenix, Kylin, Zookeeper Streaming & Messaging: Kafka, Spark Streaming, Storm ETL & Integration: Talend, Informatica, Python/Scala-based ETL Programming Languages: Python, Java, Scala, SQL Databases: HBase, MemSQL, PostgreSQL, Oracle, SQL Server Cloud & DevOps: AWS, Azure, Docker, Kubernetes, Git Security & Governance: Kerberos, Ranger, Sentry, Metadata Management Monitoring Tools: Ambari, Cloudera Manager APIs: REST, SOAP To all current Molina employees: If you are interested in applying for this position, please apply through the intranet job listing. Molina Healthcare provides a competitive benefits and compensation package. Molina Healthcare is an Equal Opportunity Employer (EOE) M/F/D/V.