GlaxoSmithKline
Responsibilities
Partner with the Scientific Knowledge Engineering team to develop physical data models to build fit-for-purpose data products
Design data architecture aligned with enterprise-wide standards to promote interoperability
Collaborate with the platform teams and data engineers to maintain architecture principles, standards, and guidelines
Design data foundations that support GenAI workflows including RAG (Retrieval-Augmented Generation), vector databases, and embedding pipelines
Work across business areas and stakeholders to ensure consistent implementation of architecture standards
Lead reviews and maintain architecture documentation and best practices for Onyx and our stakeholders
Adopt security-first design with robust authentication and resilient connectivity
Provide leadership, best practices, and subject-matter and GSK expertise to architecture and engineering teams composed of GSK FTEs, strategic partners, and software vendors
Basic Qualifications
Bachelor's degree in computer science, engineering, data science, or a similar discipline
5+ years of experience in data architecture, data engineering, or related fields in pharma, healthcare, or life sciences R&D
3+ years of experience defining architecture standards and patterns on big data platforms
3+ years of experience with data warehouse, data lake, and enterprise big data platforms
3+ years of experience with enterprise cloud data architecture (preferably Azure or GCP) and delivering solutions at scale
3+ years of hands-on relational, dimensional, and/or analytic experience (using RDBMS, dimensional, NoSQL data platform technologies, and ETL and data ingestion protocols)
Preferred Qualifications
Master's or PhD in computer science, engineering, data science, or a similar discipline
Deep knowledge and use of at least one common programming language: e.g., Python, Scala, Java
Experience with AI/ML data workflows: feature stores, vector databases, embedding pipelines, model serving architectures
Familiarity with GenAI/LLM data patterns: RAG architectures, prompt engineering data requirements, fine-tuning data preparation
Experience with GCP data/analytics stack: Spark, Dataflow, Dataproc, GCS, BigQuery
Experience with enterprise data tools: Ataccama, Collibra, Acryl
Experience with Agile frameworks: SAFe, Jira, Confluence, Azure DevOps
Experience applying CI/CD principles to data solutions
Experience with Spark and RAG-based architectures for data science and ML use cases
Strong communication skills, including the ability to explain technical concepts to non-technical stakeholders
Pharmaceutical, healthcare, or life sciences background
Salary range: $109,725 to $182,875 (annual base salary for new hires in this position).
GSK is an Equal Opportunity Employer. All qualified applicants will receive equal consideration for employment without regard to race, color, religion, sex (including pregnancy, gender identity, sexual orientation), parental status, national origin, age, disability, genetic information, military service or any basis prohibited under federal, state or local law.