Salesforce, Inc.
We are seeking a highly skilled and motivated AI Platform Engineer with a focus on Feature Store development and management to join our growing AI/ML platform team. In this role, you will design, build, and scale the data and infrastructure components that power our machine learning ecosystem – enabling consistent, reliable, and real-time access to features across development, training, and production environments. You’ll collaborate closely with data scientists, ML engineers, and data platform teams to streamline feature engineering workflows and ensure seamless integration between offline and online data sources.
You’ll be expected to work across multiple domains including data architecture, distributed systems, software engineering, and MLOps. You will help define and implement best practices for feature registration, drift detection, governance, lineage tracking, and versioning, all while contributing to the CI/CD automation that supports feature deployment across environments.
What You’ll Do
Key Responsibilities:
Feature Store Design & Development:
Architect, implement, and maintain a scalable feature store serving offline (batch), online (real-time), and streaming ML use cases.
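To give a concrete flavor of this work, here is a minimal sketch of a feature definition in Feast (one of the feature store technologies named under the qualifications below); the driver entity, file path, and feature names are purely illustrative assumptions, not part of our actual platform:

```python
# Minimal Feast feature-definition sketch; all names are hypothetical.
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Float32, Int64

# Entity that keys features to a business object (illustrative "driver" domain).
driver = Entity(name="driver", join_keys=["driver_id"])

# Offline (batch) source backing the feature view.
driver_stats_source = FileSource(
    path="data/driver_stats.parquet",  # illustrative path
    timestamp_field="event_timestamp",
)

# Feature view usable both offline (training) and online (low-latency serving).
driver_hourly_stats = FeatureView(
    name="driver_hourly_stats",
    entities=[driver],
    ttl=timedelta(days=1),
    schema=[
        Field(name="trips_today", dtype=Int64),
        Field(name="avg_rating", dtype=Float32),
    ],
    online=True,
    source=driver_stats_source,
)
```

Registering definitions like this (e.g., with `feast apply`) and materializing them into the online store is the kind of end-to-end workflow this role owns.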
Ecosystem Integration:
Build robust integrations between the feature store and ML ecosystem components such as data pipelines, model training workflows, model registry, and model serving infrastructure.
Streaming & Real-Time Data Processing:
Design and manage streaming pipelines using technologies like Kafka, Kinesis, or Flink to enable low-latency feature generation and real-time inference.
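To make this concrete, here is a toy sketch of low-latency feature generation from a Kafka topic, assuming the kafka-python client and a JSON event schema; the topic name, broker address, and fields are hypothetical:

```python
# Toy streaming-feature sketch; topic, broker, and schema are hypothetical.
import json
from collections import defaultdict

from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "ride_events",                       # hypothetical topic
    bootstrap_servers="localhost:9092",  # illustrative broker address
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

# Rolling per-driver trip counts kept in memory for simplicity; a production
# pipeline would write these to an online store (e.g., Redis) for serving.
trip_counts = defaultdict(int)

for record in consumer:
    driver_id = record.value["driver_id"]
    trip_counts[driver_id] += 1
    # The feature value is now fresh enough for real-time inference.
    print(driver_id, trip_counts[driver_id])
```

In practice this logic would run in a managed stream processor (Flink, Kinesis Data Analytics, or Kafka Streams) rather than a bare consumer loop, with checkpointing and delivery guarantees handled by the framework.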
Feature Governance & Lineage:
Define and enforce governance standards for feature registration, metadata management, lineage tracking, and versioning to ensure data consistency and reusability.
Collaboration with ML Teams:
Partner with data scientists and ML engineers to streamline feature discovery, definition, and deployment workflows, ensuring reproducibility and efficient model iteration.
Data Pipeline Engineering:
Build and optimize ingestion and transformation pipelines that handle large-scale data while maintaining accuracy, reliability, and freshness.
CI/CD Automation:
Implement CI/CD workflows and infrastructure-as-code to automate feature store provisioning and feature promotion across environments (Dev → QA → Prod).
Cloud Infrastructure & Security:
Collaborate with platform and DevOps teams to ensure secure, scalable, and cost-effective operation of feature store and streaming infrastructure in cloud environments.
Monitoring & Observability:
Develop monitoring and alerting frameworks to track feature data quality, latency, and freshness across offline, online, and streaming systems.
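As a simple illustration of the freshness side of this responsibility, here is a toy check of the kind such a framework might run; the SLO threshold and alerting path are assumptions, not a prescribed design:

```python
# Toy feature-freshness check; the 15-minute SLO is purely illustrative.
from datetime import datetime, timedelta, timezone
from typing import Optional

FRESHNESS_SLO = timedelta(minutes=15)  # assumed SLO for online features

def is_stale(last_event_ts: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if a feature's newest event timestamp breaches the SLO."""
    now = now or datetime.now(timezone.utc)
    return now - last_event_ts > FRESHNESS_SLO

# A real framework would evaluate this per feature view and route breaches
# to a metrics/alerting stack (e.g., Prometheus plus a paging service).
```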
What We’re Looking For
Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
5+ years of experience in data engineering, platform engineering, or MLOps roles.
Strong proficiency in Python and familiarity with distributed data processing and orchestration frameworks such as Airflow, Spark, or Flink.
Hands-on experience with feature store technologies (e.g., Feast, SageMaker Feature Store, Tecton, Databricks Feature Store, or custom implementations).
Experience with cloud data warehouses (e.g., Snowflake) and transformation frameworks (e.g., dbt) for data modeling, transformation, and feature computation in batch environments.
Expertise in streaming data platforms (e.g., Kafka, Kinesis, Flink) and real-time data processing architectures.
Experience with cloud environments (AWS preferred) and infrastructure-as-code tools (Terraform, CloudFormation).
Strong understanding of CI/CD automation, containerization (Docker, Kubernetes), and API-driven integration patterns.
Knowledge of data governance, lineage tracking, and feature lifecycle management best practices.
Excellent communication skills, a collaborative mindset, and a strong sense of ownership.
Preferred Qualifications (Bonus Points):
Experience with the Salesforce ecosystem
Open-source contributions or experience in feature store ecosystem development
Experience with unstructured databases (vector or graph databases) and RAG pipelines
Experience with context engineering, including structuring data, prompts, and logic for AI systems and managing memory and external knowledge