Compugra Systems

DATA ARCHITECT

Compugra Systems, California, Missouri, United States, 65018

Location: California Bay Area

1 Year

Role expectations

The Data Architect will be responsible for designing, implementing, and maintaining scalable data architectures on the Databricks platform, with a strong understanding of SAP data structures, especially master data. The role requires hands‑on experience in data engineering, governance, and platform administration, as well as the ability to guide development teams through best practices, architecture decisions, and code reviews.

Skills

Technical Skills

8-15 years in data engineering/architecture, with 3-5 years specifically in Databricks.

Deep knowledge of:

PySpark, Spark SQL, Delta Lake

Unity Catalog, cluster management, lakehouse governance

Azure/AWS/Google Cloud Platform cloud architecture

Strong experience with SAP data:

Extracting data from ECC/S4/BW

Understanding of SAP tables, master data structures, and business logic

Experience with IDOCs, BAPIs, ODP/ODQ sources

Strong MDM experience:

Master data modelling

Data quality frameworks

Metadata management

Golden record management

CI/CD: Git, Azure DevOps, GitHub Actions or similar.

Databricks Workflows / Jobs orchestration.

Exposure to planning systems such as SAP IBP/APO (preferred but not required).

Soft Skills

Strong communication and documentation skills.

Ability to interact with business and technical teams.

Problem‑solving with a focus on performance, reliability, and scalability.

Leadership mindset with ability to guide and upskill teams.

Detailed skills

Architecture & Solution Design

Design end‑to‑end data architectures leveraging the Databricks Lakehouse Platform (Delta Lake, Unity Catalog, Lakehouse Governance).

Develop scalable ingestion, transformation, and consumption patterns for SAP data (ECC/S4, BW, IBP, APO, etc.).

Define data models for Master Data Management (MDM): Material, Customer, Vendor, BOM, Plant, Cost Center, Profit Center, etc.

Create logical/physical models aligned with business processes (planning, procurement, manufacturing, finance).

Databricks Platform Administration

Manage workspace configuration, clusters, secrets, networking, and access control.

Set up and maintain Unity Catalog, catalogs, schemas, storage credentialing, and data lineage.

Develop CI/CD frameworks for Databricks repos, workflows, and environment promotions.

Monitor platform performance, optimize cluster sizing, and implement cost‑control measures.

Infrastructure & Environment Setup

Design and configure Databricks environments (Dev/Test/Prod) across Azure/AWS/Google Cloud Platform.

Set up pipelines for SAP data ingestion using ADF, Synapse, Data Factory, AWS Glue, SAP connectors, ODP/ODQ, and RFC/IDOC/BAPI mechanisms.

Architect secure storage layers (Bronze/Silver/Gold) with Delta Lake best practices.

Ensure integration with enterprise security standards: Key Vaults, ADLS/S3, IAM, networking.

Data Governance & MDM

Implement governance frameworks around data quality, lineage, cataloging, and stewardship.

Define master data validations, deduplication logic, survivorship rules, and versioning.

Implement data quality rules using Delta Live Tables (DLT), expectations, and audits.

Collaborate with business teams to define golden records and standardized master data models.

Best Practices, Standards & Reviews

Create coding standards for PySpark, SQL, Delta Lake, and ETL/ELT pipelines.

Review developer code with focus on:

Query optimization

Efficient Delta Lake operations (MERGE, OPTIMIZE, ZORDER)

Cluster cost optimization

Error handling and logging patterns

Define reusable frameworks for ingestion, transformation, and reconciliation.

Development Guidance & Team Enablement

Mentor developers on Databricks architecture, PySpark patterns, and SAP data structures.

Provide technical leadership in design sessions and sprint planning.

Conduct knowledge sessions on best practices and common pitfalls.

Troubleshoot complex data pipeline issues across SAP and Databricks.
