West Texas Gas

Data Engineer

West Texas Gas, Dallas, Texas, United States, 75215


Job Details

Job Location: WTG Dallas - Dallas, TX

Remote Type: Fully Remote

Overview

We are looking for a Data Engineer to join a new and dynamic team focused on building next-generation technology solutions for a private equity-backed oil and gas firm undergoing digital transformation. This role presents a rare opportunity to be part of a greenfield initiative: designing, building, and owning the data infrastructure that powers our analytics, operations, and internal applications.

You'll be part of a lean, high-impact team where data is not an afterthought; it's central to our strategy. You will help define how data flows across the organization, implement modern architecture patterns, and ensure our business users and developers have access to reliable, well-modeled, and timely information.

What You'll Do

As a Data Engineer, you'll play a critical role in shaping and operating the data infrastructure that powers decision-making, analytics, and automation across our organization. You'll work on a greenfield technology stack alongside developers, analysts, and AI engineers to design reliable data systems that meet real-world operational needs, and you'll support the design and build of a new enterprise data warehouse from the ground up.

• Model and structure enterprise data by translating business requirements into logical and physical schemas optimized for both transactional processing and analytical reporting.
• Design and optimize custom application databases, creating performant tables, indexes, and views for scalable access across internal applications and BI tools.
• Build and manage a multi-layer (bronze, silver, gold) medallion-architecture data warehouse to ensure reliable, high-quality data ingestion, transformation, and delivery (see the sketch after this list).
• Support AI initiatives by collaborating with AI engineers to deliver clean, structured data into AI/ML workflows and integrate vector databases for semantic search and embeddings.
• Develop and maintain modern ELT/ETL pipelines to ingest data from APIs, flat files, web scraping workflows, and internal systems, automating data transformation and validation for consistency and quality.
• Enable analytics teams by creating consumable datasets, materialized views, and well-documented schemas for tools like Power BI and Excel.
• Support the design, performance, and reliability of the data warehouse by optimizing workflows and resolving pipeline or infrastructure issues.
• Design and manage cloud-native data platforms using Azure services such as Data Factory, Synapse Analytics, Azure SQL, and Blob Storage.
• Implement Infrastructure as Code (IaC) using ARM, Bicep, or Terraform to provision, secure, and automate data infrastructure in Azure.
• Administer operational data systems, including configuring alerting, monitoring, backups, and recovery procedures to ensure system availability and resiliency.
• Collaborate across teams, working closely with backend developers, analysts, GIS teams, and product stakeholders to ensure data availability and integration alignment.
• Document systems and mentor teammates, maintaining data architecture documentation and providing guidance on best practices, tools, and design patterns.
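To make the medallion pattern above concrete, here is a minimal, hypothetical pandas sketch of promoting records from bronze (raw, as-landed) through silver (cleaned and validated) to gold (business-ready aggregates). The file paths and column names (meter_id, read_at, volume_mcf) are illustrative assumptions, not WTG's actual schema; a production build would target Azure storage and Synapse tables rather than local Parquet files.

```python
import pandas as pd

# Hypothetical local paths standing in for warehouse layers.
BRONZE_PATH = "bronze/meter_readings.parquet"  # raw ingest, as-landed
SILVER_PATH = "silver/meter_readings.parquet"  # cleaned and validated
GOLD_PATH = "gold/daily_volumes.parquet"       # business-ready aggregate

def bronze_to_silver() -> pd.DataFrame:
    """Validate raw readings: coerce types, drop bad rows, deduplicate."""
    raw = pd.read_parquet(BRONZE_PATH)
    clean = (
        raw.assign(
            read_at=pd.to_datetime(raw["read_at"], errors="coerce"),
            volume_mcf=pd.to_numeric(raw["volume_mcf"], errors="coerce"),
        )
        .dropna(subset=["meter_id", "read_at", "volume_mcf"])
        .drop_duplicates(subset=["meter_id", "read_at"])
    )
    clean.to_parquet(SILVER_PATH, index=False)
    return clean

def silver_to_gold(clean: pd.DataFrame) -> pd.DataFrame:
    """Roll validated readings up into a daily reporting table."""
    daily = (
        clean.assign(read_date=clean["read_at"].dt.date)
        .groupby(["meter_id", "read_date"], as_index=False)["volume_mcf"]
        .sum()
    )
    daily.to_parquet(GOLD_PATH, index=False)
    return daily

if __name__ == "__main__":
    silver_to_gold(bronze_to_silver())
```

Writing each layer out separately, rather than transforming in one pass, is what lets downstream consumers (BI tools, AI workflows) pick the level of refinement they need.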

What You'll Need

• Proven ability to design normalized and denormalized data models for both operational and analytical purposes.
• Hands-on experience building and managing relational and analytical databases, including SQL Server, PostgreSQL, and Azure Synapse.
• Strong SQL skills and experience with ETL/ELT tools such as Azure Data Factory, Synapse Pipelines, and Airflow with Python.
• Proficiency in Python for data manipulation and scripting, including libraries such as pandas or PySpark.
• Experience working with Parquet, JSON, flat files, and structured/unstructured data sources.
• Strong knowledge of Azure cloud services related to data ingestion, storage, compute, and orchestration.
• Experience building web data acquisition tools, including screen scraping and automation using Playwright, Puppeteer, BeautifulSoup, or Selenium (see the sketch below).
• Bachelor's degree in Computer Science, Data Engineering, or a related field.
• 3-5 years of experience in data engineering or a related role.
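As a gauge of the screen-scraping requirement above, here is a minimal requests/BeautifulSoup sketch of the kind of acquisition tooling the role describes. The URL, table layout, and column names are placeholder assumptions for illustration, not a real WTG data source.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Placeholder URL; a real job would point at an actual posted-prices page.
URL = "https://example.com/posted-prices"

def scrape_price_table(url: str) -> pd.DataFrame:
    """Fetch a page and extract a simple two-column HTML table."""
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    rows = []
    for tr in soup.select("table tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) == 2:  # this sketch assumes (location, price) rows
            rows.append({"location": cells[0], "price": cells[1]})
    return pd.DataFrame(rows)

if __name__ == "__main__":
    print(scrape_price_table(URL).head())
```

In practice a scrape like this would feed the bronze layer of the warehouse sketched earlier, with validation deferred to the silver step.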