ClearanceJobs
Data Engineer
In today's rapidly evolving technology landscape, an organization's data has never been more important to achieving its mission and business goals. Our data exploitation experts work with our clients to support their mission and business goals by creating and executing a comprehensive data strategy using the best technology and techniques for the challenge at hand. At Steampunk, our goal is to build and execute a data strategy for our clients that coordinates data collection and generation, aligns the organization and its data assets in support of the mission, and ultimately realizes mission goals as effectively as possible. For our clients, data is a strategic asset: they are looking to become fact-based, data-driven, customer-focused organizations. To help realize this goal, they are leveraging visual analytics platforms to analyze, visualize, and share information. At Steampunk you will design and develop solutions to high-impact, complex data problems, working with the best data practitioners around. Our data exploitation approach is tightly integrated with Human-Centered Design and DevSecOps.

We are looking for a seasoned Data Engineer to work with our team and our clients to develop enterprise-grade data platforms, services, and pipelines in Databricks. We are looking for more than just a "Data Engineer": we want a technologist with excellent communication and customer service skills and a passion for data and problem solving.

Responsibilities:
- Lead and architect data migrations using Databricks with a focus on performance, reliability, and scalability.
- Assess and understand ETL jobs, workflows, data marts, BI tools, and reports.
- Address technical inquiries concerning customization, integration, enterprise architecture, and general features/functionality of data products.
- Work with database, data warehouse, and data mart solutions in the cloud (preferably AWS; alternatively Azure or GCP).
- Apply the key must-have skill sets: Databricks, SQL, PySpark/Python, and AWS (a brief illustrative sketch of this stack follows the requirements below).
- Support an Agile software development lifecycle.
- Contribute to the growth of our AI & Data Exploitation Practice!

Required:
- Ability to hold a position of public trust with the US government.
- 2-4 years of industry experience coding commercial software and a passion for solving complex problems.
- 2-4 years of direct Data Engineering experience with tools such as:
  - Big data tools: Databricks, Apache Spark, Delta Lake, etc.
  - Relational SQL (preferably T-SQL; alternatively pgSQL or MySQL).
  - Data pipeline and workflow management tools: Databricks Workflows, Airflow, Step Functions, etc.
  - AWS cloud services: Databricks on AWS, S3, EC2, RDS (or Azure equivalents).
  - Object-oriented/object-functional scripting languages: PySpark/Python, Java, C++, Scala, etc.
- Experience working with data lakehouse architecture and Delta Lake/Apache Iceberg.
- Advanced working SQL knowledge and experience with relational databases, including query authoring and optimization, as well as working familiarity with a variety of databases.
- Experience manipulating, processing, and extracting value from large, disconnected datasets.
- Ability to inspect existing data pipelines, discern their purpose and functionality, and re-implement them efficiently in Databricks.
- Experience manipulating structured and unstructured data.
- Experience architecting data systems (transactional and warehouses).
- Experience with the SDLC, CI/CD, and operating in dev/test/prod environments.
- Experience with data cataloging tools such as Informatica EDC, Unity Catalog, Collibra, Alation, Purview, or DataZone is a plus.
- Commitment to data governance.
- Experience working in an Agile environment.
- Experience supporting project teams of developers and data scientists who build web-based interfaces, dashboards, reports, and analytics/machine learning models.
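For candidates unfamiliar with the stack named above, the sketch below gives a rough flavor of the kind of pipeline work involved (Databricks, PySpark/Python, Delta Lake, data on S3). It is purely illustrative and not part of the role description: the bucket path, table name, and columns are hypothetical, and it assumes a Databricks runtime or any Spark environment with Delta Lake configured.

# Minimal, hypothetical sketch of a raw-to-curated pipeline step.
# All paths, table names, and columns are invented for illustration.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` is already provided

# Ingest raw records from S3 (hypothetical bucket and prefix).
raw = (
    spark.read
    .option("header", "true")
    .csv("s3://example-bucket/raw/orders/")
)

# Light typing and de-duplication typical of a cleanup step.
clean = (
    raw
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("double"))
    .dropDuplicates(["order_id"])
)

# Persist as a Delta table (hypothetical schema/table) so downstream
# BI tools and reports can query it.
(
    clean.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("analytics.orders_silver")
)

In practice, a step like this would typically be scheduled through Databricks Workflows or Airflow and promoted through dev/test/prod environments via CI/CD, in line with the requirements above.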