RSM Solutions, Inc.
We are hiring a Data Integration Engineer to work onsite in Irvine, California. The role focuses on designing, building, and operating batch and streaming pipelines to move data from diverse sources into our Data Vault warehouse and ML cluster.
Responsibilities
Design, develop, and deploy incremental and full load pipelines using SSIS, Spark, Runbooks, and Azure Data Factory.
Build CDC solutions to minimize latency for downstream reporting and ML features.
Automate schema evolution and metadata population for hubs, links, and satellites.
Implement validation rules, unit tests, and data quality frameworks.
Maintain a requirements traceability matrix and publish data lineage documentation; collaborate with business analysts (BAs) to translate user stories into technical interfaces.
Create CI/CD pipelines using Azure DevOps and Git.
Develop PowerShell/.NET utilities to orchestrate jobs, manage secrets, and push metrics to Grafana or Azure Monitor.
Benchmark and tune Spark, SQL, and SSIS performance; recommend indexing, partitioning, and cluster sizing strategies.
Stay current with emerging integration patterns and propose pilots for adoption.
Requirements
4+ years building data integration with MS SQL, SSIS, and Spark.
At least 2 years of experience building ML clusters.
At least 2 years of experience with Data Vault.
Strong T-SQL, Python/Scala for Spark, and PowerShell/.NET scripting skills; working knowledge of MongoDB aggregation, SSAS tabular models, and Git CI/CD.
Excellent problem-solving, communication, and stakeholder management abilities.
Must be a US Citizen or Green Card holder; candidates on visas (H1, OPT, etc.) are not eligible.
Seniority Level: Mid‑Senior level
Employment Type: Full‑time
Job Function: Information Technology
Industries: Manufacturing; Retail Apparel and Fashion
Base Pay Range: $100,000 - $120,000 per year