GSK
Data Architect II
Location: GSK, Cambridge MA, USA (and other locations)
This position supports the research data ecosystem and enables scientists to accelerate medical discovery through modern data architecture.
Overview
The Onyx Research Data Tech organization is a full‑stack shop that powers data and analytics at scale, partnering with scientists to deliver tailored solutions.
Onyx focuses on:
Building a metadata‑enabled data experience for scientists, engineers, and decision‑makers
Providing AI/ML and data analysis environments to accelerate predictive capabilities
Engineering data at scale as a unified asset to unlock real‑time value
Responsibilities
Partner with Scientific Knowledge Engineering to develop physical data models for fit‑for‑purpose products
Design data architecture aligned with enterprise standards to promote interoperability
Collaborate with platform teams and data engineers to maintain architecture principles, standards, and guidelines
Design foundations that support GenAI workflows, including RAG, vector databases, and embedding pipelines
Work across business areas and stakeholders to ensure consistent implementation of architecture standards
Lead reviews and maintain architecture documentation and best practices for Onyx and stakeholders
Adopt a security‑first design with robust authentication and resilient connectivity
Provide leadership, subject matter expertise, and GSK knowledge to architecture and engineering teams, partners, and vendors
Qualifications
Basic
Bachelor’s degree in computer science, engineering, data science, or similar discipline
5+ years of data architecture or engineering in pharma, healthcare, or life sciences R&D
3+ years defining architecture standards on Big Data platforms
3+ years of experience with data warehouse, data lake, and enterprise big data platforms
3+ years of enterprise cloud data architecture (Azure or GCP) at scale
3+ years of hands‑on relational, dimensional, and analytic experience with RDBMS, NoSQL, ETL, and ingestion protocols
Preferred
Master’s or PhD in relevant discipline
Deep knowledge of at least one programming language (Python, Scala, Java)
Experience with AI/ML data workflows: feature stores, vector databases, embedding pipelines, model serving architectures
Familiarity with GenAI/LLM patterns: RAG, prompt engineering, data preparation
Experience with GCP data/analytics stack: Spark, Dataflow, Dataproc, GCS, BigQuery
Experience with enterprise data tools: Ataccama, Collibra, Acryl
Experience with Agile frameworks and tools: SAFe, Jira, Confluence, Azure DevOps
Experience applying CI/CD principles to data solutions
Strong communication skills to explain technical concepts to non‑technical stakeholders
Pharmaceutical, healthcare, or life sciences background
Compensation and Benefits
Annual base salary: $109,725‑$182,875 (region dependent)
Annual bonus and long‑term incentive program (share‑based)
Health care and other insurance benefits for employee and family
Retirement benefits, paid holidays, vacation, paid caregiver/parental and medical leave
Equal Opportunity
GSK is an Equal Opportunity Employer. All qualified applicants will receive equal consideration for employment without regard to race, color, religion, sex (including pregnancy, gender identity, and sexual orientation), parental status, national origin, age, disability, genetic information, military service, or any basis prohibited under federal, state or local law.