Pccarx

Lead Data Architect - AI Driven Modeling & Governance

Pccarx, Houston


Job Category: AI & Analytics
Requisition Number: LEADD001483
Full-Time, Hybrid
Location: PCCA, Houston, TX 77099, USA

Description

What We're Looking For

The Lead Data Architect serves as the organization's definitive hands-on authority on dimensional data modeling, AI-accelerated governance, and semantic layer architecture within Microsoft Fabric's medallion framework. You will architect and continuously optimize our enterprise data lakehouse, orchestrating Bronze → Silver → Gold data transformations while publishing trusted data products that fuel both self-service analytics and advanced AI workloads.

As the primary data modeler and governance lead embedded within the AI-A team, you'll design foundational star schema structures that enable seamless Power BI/Copilot integrations for enterprise-wide self-service BI. A critical mandate involves validating, approving, and enhancing dimensional models, ensuring they meet rigorous performance, quality, and business-semantic standards. This role demands mastery of traditional star schema design and of the organization and metadata tagging of unstructured data, coupled with expertise in leveraging modern agentic AI for automation and scale.

With executive sponsorship backing your governance charter, you'll own the comprehensive framework, policies, and daily stewardship practices that keep PCCA's data accurate, secure, and highly reusable across all enterprise touchpoints. The position uniquely balances deep dimensional modeling expertise with cutting-edge AI capabilities while enforcing enterprise data stewardship that transforms raw data into strategic business assets.

What You Will Do

- Translate business-driven use cases into technical data-management requirements and designs that fulfill the needs of end users.
- Design and evolve bronze-to-gold medallion layers in Microsoft Fabric OneLake, prioritizing star schema development for business-ready semantic models in the Gold layer.
- Implement slowly changing dimensions (SCD) and hierarchy management associated with clinical/operational KPIs.
- Maintain a centralized data dictionary with column-level business definitions, lineage, and quality rules.
- Conduct capacity planning for Delta Lake storage and Spark pools, balancing cost and performance.
- Continually expand the content of the data lakehouse with both structured and unstructured data sources, integrating all contents into a consistent data-management framework that facilitates self-service analytics, retrieval-augmented generation (RAG), and agentic AI.

AI-Driven Governance

- Deploy Microsoft Purview for automated metadata harvesting, classification, and sensitive data labeling.
- Implement agentic AI workflows to:
  - Generate data quality test cases from semantic model definitions.
  - Escalate governance violations to domain stewards.
- Chair the data model review board with BI/AI-A teams to align on conformed dimensions.

Data Product Stewardship

- Manage versioning and change control via Azure DevOps.

Vertical-Slice Delivery

- Execute iterative “thin‑slice” releases (ingest ➜ model ➜ govern ➜ insight) to accelerate business value while maturing the enterprise model.

Feature Store & ML Enablement

- Expose Gold‑layer tables and feature views to Abacus (or equivalent) for model training/inference.
- Orchestrate dbt/Fabric pipelines for auto‑refresh.

Modern Data Ingestion & Integration

- Build GraphQL ingestion pipelines for EHR/CRM APIs using Azure API Management.
- Process unstructured clinical notes/imaging metadata through Fabric Dataflow Gen2 with RAG-enhanced indexing.
- Implement change data capture (CDC) for critical operational systems, including but not limited to ERP, CRM, eCommerce, and internal business applications.

AI-Enhanced BI Enablement

- Design Copilot-ready, AI-semantic Power BI models with natural-language query optimizations.
- Implement RAG architecture using Azure AI Search to enrich Power BI reports with EHR context.
- Develop self-service data contracts for gold-layer DataMart consumption.
- Mentor AI-A team engineers on dimensional modeling and AI-facilitation best practices in data management.
- Partner with the Chief Architect on platform-wide capacity and security planning.
- Train BI analysts on semantic layer usage and quality reporting.

Who You Are

Core Requirements

- Bachelor's or master's degree in Computer Science, Data Engineering, AI, or a related field.
- 8+ years in data architecture, with 5+ years of hands-on star schema design on cloud platforms.
- Data Architecture & Modeling: Deep expertise in designing, documenting, and implementing medallion architectures and star-schema data warehouses. Proficient with dbt Core for modular transformations and experienced with clinical/healthcare data models (HL7/FHIR).
- Microsoft Data Platform: Expert-level knowledge of and hands-on experience with the Microsoft data stack, including Microsoft Fabric administration, Azure Synapse, and Power BI.
- Data Integration & Processing: Proven ability to build robust data pipelines using GraphQL/API integration patterns, process unstructured data with Fabric Dataflow Gen2, and implement Change Data Capture (CDC) for critical systems.
- AI/ML Integration: Experience operationalizing AI/ML models, including implementing Retrieval-Augmented Generation (RAG) architectures with Azure AI Search and integrating AI workflows into data architectures.
- Governance & Security: Expertise in deploying Microsoft Purview for automated governance and a strong understanding of enterprise-level data governance and security frameworks.

Leadership & Professional Skills

- Leadership & Mentorship: Proven experience leading technical teams, mentoring engineers, and driving organizational change through influence and collaboration.
- Communication & Documentation: Exceptional ability to create comprehensive technical documentation, such as architectural blueprints, and to communicate complex concepts effectively to technical, business, and executive stakeholders.
- Methodology: Demonstrated success working in Scrum/Agile environments with a strong understanding of Azure DevOps practices, including CI/CD pipelines and automated testing.

Preferred Qualifications

- Certification in cloud platforms including Microsoft Azure, Microsoft Power Platform, and Microsoft Fabric. Preferred certifications: Azure Data Engineer and Azure Solutions Architect Expert.
- Proven ability to operationalize AI/ML pipelines (feature engineering, model registry, monitoring).
- Familiarity with emerging AI tools and technologies.
- TOGAF or DAMA CDMP certification.
- Experience with agentic AI pipeline frameworks.
- Knowledge of HIPAA and 21 CFR Part 11 compliance.
- Proven track record of aligning technical strategy with organizational goals.
- Experience in enterprise-level data governance and security frameworks.
- Familiarity with integrating structured, semi-structured, and unstructured data into unified systems.
- Exposure to version control systems like GitHub or Azure DevOps.

Who We Are

PCCA helps pharmacists and prescribers create personalized medicine that makes a difference in patients’ lives. As a complete resource for independent and health system compounding pharmacists, PCCA provides high-quality products, education, and support to more than 3,000 pharmacies throughout the United States, Canada, Australia, and other countries around the world. Incorporated in 1981 by a network of pharmacists, PCCA has supported pharmacy compounding for more than 40 years. Learn more at

Equal Opportunity Employer

This employer is required to notify all applicants of their rights pursuant to federal employment laws. For further information, please review the Know Your Rights notice from the Department of Labor.