Sev1tech, Inc.
Senior Databricks Data Engineer - Data Lake
Sev1tech, Inc., Arlington, Virginia, United States, 22201
Overview
We are seeking a highly experienced Senior Data Lake Engineer to join our team. As the Senior Data Lake Engineer, you will play a critical role in establishing and configuring an enterprise-level Databricks solution to support our federal customer organization's data lake initiatives. This position offers a unique opportunity to work with cutting-edge technologies and shape the future of our federal customer's data infrastructure. If you have expertise in Databricks and a passion for building scalable and secure data lake solutions, we would like to hear from you. This position requires 2 days a week on-site in Arlington, VA (in-office requirement subject to change based on client request). The team is looking for a "hands on keyboard" contributor rather than someone to lead a team.
Responsibilities:
Lead the design, implementation, and configuration of an enterprise Data Lake solution utilizing Databricks, ensuring scalability, reliability, and optimal performance.
Collaborate with cross-functional teams to gather requirements, understand data integration needs, and define data lake architecture and governance policies.
Establish and configure Databricks workspaces, clusters, and storage components, optimizing the solution for efficient data processing, query performance, and data governance.
Design and implement data ingestion pipelines to efficiently extract, transform, and load data from various sources into the data lake using Databricks tools and services.
Develop and maintain data lake security frameworks, including access controls, encryption solutions, and data masking techniques to protect sensitive data.
Collaborate with data engineers and data scientists to optimize data pipelines, develop data transformations, and ensure data quality and integrity.
Monitor and tune Databricks clusters and workloads to ensure performance, reliability, and cost optimization, utilizing automated scaling and resource management techniques.
Implement best practices for data governance, data cataloging, metadata management, and data lineage within Databricks, adhering to regulatory and compliance requirements.
Collaborate with infrastructure teams to ensure data lake infrastructure meets scalability and availability requirements, leveraging Databricks cluster management and AWS/Azure services.
Develop and maintain documentation and guidelines related to the Databricks solution, including architecture diagrams, standards, and processes.
Stay up to date with the latest advancements in Databricks, big data technologies, and cloud platforms, continuously evaluating and implementing new features and capabilities.
Provide technical guidance and mentorship to junior data engineers, promoting best practices and fostering a culture of continuous learning and growth.
Collaborate with stakeholders to understand their data analytics and reporting needs and develop scalable data models and data transformation processes to support these requirements.
Support data lake-related incident resolutions, troubleshooting data quality issues, performance bottlenecks, and other data-related challenges.
Collaborate with data governance and compliance teams to ensure data privacy, security, and compliance guidelines are adhered to within the data lake solution.
Participate in the evaluation and selection of new tools, technologies, and services to enhance the data lake infrastructure.
Qualifications:
Bachelor's degree in computer science, information technology, or a related field. Equivalent experience will also be considered.
Proven experience in building and configuring enterprise-level data lake solutions using Databricks in an AWS or Azure environment.
In-depth knowledge of Databricks architecture, including workspaces, clusters, storage, notebook development, and automation capabilities.
Strong expertise in designing and implementing data ingestion pipelines, data transformations, and data quality processes using Databricks.
Experience with big data technologies such as Apache Spark, Apache Hive, Delta Lake, and Hadoop.
Solid understanding of data governance principles, data modeling, data cataloging, and metadata management.
Hands-on experience with cloud platforms like AWS or Azure, including relevant services like S3, EMR, Glue, Data Factory, etc.
Proficiency in SQL and one or more programming languages (Python, Scala, or Java) for data manipulation and transformation.
Knowledge of data security and privacy best practices, including data access controls, encryption, and data masking techniques.
Strong problem-solving and analytical skills, with the ability to identify and resolve complex data-related issues.
Excellent interpersonal and communication skills, with the ability to collaborate effectively with technical and non-technical stakeholders.
Experience in a senior or lead role, providing technical guidance and mentorship to junior team members.
Relevant certifications such as Databricks Certified Developer or Databricks Certified Professional are highly desirable.
Must be eligible to obtain a Department of Homeland Security EOD clearance (Requirements: 1. US Citizenship, 2. Favorable Background Investigation).
Clearance Preference:
Active DHS/CISA suitability - 1st priority
Any DHS badge + DoD Top Secret - 2nd choice
DoD Top Secret + willingness to obtain DHS/CISA suitability - 3rd choice (it can take 10-60 days to obtain suitability; work can only begin once suitability is fully adjudicated).