Walmart

Senior Manager, Software Engineering - ML

Walmart, Sunnyvale, California, United States, 94086

Senior Manager, Software Engineering - Search ML Infrastructure

As a Senior Engineering Manager on the Walmart International Search ML Infrastructure team, you will be responsible for leading and managing a team of data platform engineers and ML engineers in building foundational platforms that power ML and search capabilities across all international markets. You will partner closely with data science teams, search engineering, Product, and international stakeholders to ensure we build scalable, multi-tenant, and high-performance infrastructure that accelerates time-to-market for ML models and enables data-driven decision making at global scale.

About Team:
Focusing on customer, associate and business needs, this team works with Walmart International, which includes more than 5,200 retail units, operating in 23 countries such as Canada, Central America, Chile, China, India, Mexico and South Africa to name a few. The Search ML Infrastructure team owns the foundational data pipelines, ML serving infrastructure, evaluation platforms, and experimentation frameworks that enable search and ML capabilities across all international markets.

What you'll do:
Provide guidance and lead a team of data platform engineers and ML engineers to build quality ML infrastructure solutions that process omni-channel engagement data, serve ML models at scale, and enable sophisticated evaluation and experimentation frameworks
Ensure Software Quality Engineering by keeping the engineers up to date on best practices for ML operations, automated testing for data pipelines and ML systems, and comprehensive monitoring and alerting, and by taking a proactive approach to production ML infrastructure support
Develop and implement ML platform development strategies using Agile processes: iterative and incremental development; collaboration between self-organizing, cross-functional teams on requirements and solutions; and adaptive planning, evolutionary development and delivery for rapid and flexible response to changing ML and data science needs
Drive the execution of multiple ML infrastructure initiatives and projects by identifying data science team pain points and operational needs; developing and communicating technical roadmaps and priorities; removing barriers and obstacles that impact ML model deployment speed; providing resources and technical guidance; identifying performance standards for data pipelines and ML systems; measuring progress and adjusting performance accordingly; developing contingency plans for production ML systems; and demonstrating adaptability and supporting continuous learning
Work closely with data science teams, search engineering leads, international market stakeholders, and Product teams to ensure ML infrastructure work is aligned with business objectives and accelerates ML model time-to-market
Own end-to-end ML infrastructure including parsers/stitchers for omni-channel data, multi-tenant data pipelines, feature engineering platforms, model serving infrastructure, evaluation systems (including site crawling, query sampling, NDCG evaluation), and experimentation frameworks

What you'll bring:
Strong senior-level engineering leadership expertise in medium to large size technology companies with a focus on ML infrastructure, data platforms, or search systems
12+ years of total experience, of which 8+ years in ML infrastructure, data platform engineering, or platform development with hands-on technical contributions
12+ years of hands-on experience with data pipeline technologies (Apache Kafka, Apache Spark, Apache Flink, Apache Airflow), distributed data processing, and real-time streaming systems
Expert-level hands-on experience with ML model serving infrastructure (TensorFlow Serving, MLflow, Kubeflow), feature engineering platforms, and multi-tenant ML operations at scale
Extensive experience with containerization technology (Kubernetes, Docker), microservices architecture, and distributed computing for ML workloads
Prior hands-on development experience in Python/Java/Scala for building data pipelines, ML serving systems, and evaluation frameworks at enterprise scale
Deep technical knowledge of cloud platforms (AWS, GCP, Azure) with a focus on ML services, managed data services, and infrastructure automation
Proven experience in hiring and mentoring world-class data engineering and ML engineering teams
Experience leading large-scale ML infrastructure systems in fast-paced, multi-market environments
Experience with A/B testing frameworks, statistical analysis, search evaluation metrics (NDCG, MAP, MRR), and experimentation platforms
Hands-on experience with monitoring and observability tools (Prometheus, Grafana, ELK stack) for data pipelines and ML systems
Experience working in large, distributed teams using CI/CD for ML operations and agile methodologies
Excellent written and verbal communication skills, with the ability to collaborate across international markets and with technical stakeholders
BS or MS degree in Computer Science, Data Engineering, or a related technical field and 12+ years of industry experience

About Walmart Global Tech
Imagine working in an environment where one line of code can make life easier for hundreds of millions of people. That's what we do at Walmart Global Tech. We're a team of software engineers, data scientists, cybersecurity experts and service professionals within the world's leading retailer who make an epic impact and are at the forefront of the next retail disruption. People are why we innovate, and people power our innovations. We are people-led and tech-empowered. We train our team in the skillsets of the future and bring in experts like you to help us grow. We have roles for those chasing their first opportunity as well as those looking for the opportunity that will define their career. Here, you can kickstart a great career in tech, gain new skills and experience for virtually every industry, or leverage your expertise to innovate on a scale that impacts millions and reimagine the future of retail.

Flexible, hybrid work:
We use a hybrid way of working that is primarily in office coupled with virtual when not onsite. Our campuses serve as a hub to enhance collaboration, bring us together for purpose and deliver on business needs. This approach helps us make quicker decisions, remove the location barriers across our global team and be more flexible in our personal lives.

Benefits:
Beyond our great compensation package, you can receive incentive awards for your performance. Other great perks include 401(k) match, stock purchase plan, paid maternity and parental leave, PTO, multiple health plans, and much more.

Equal Opportunity Employer:
Walmart, Inc. is an Equal Opportunity Employer -- By Choice. We believe we are best equipped to help our associates, customers and the communities we serve live better when we really know them. That means understanding, respecting and valuing unique styles, experiences, identities, ideas and opinions -- while being inclusive of all people.

Minimum Qualifications:
Option 1: Bachelor's degree in computer science, computer engineering, computer information systems, software engineering, or related area and 5 years' experience in software engineering or related area.
Option 2: 7 years' experience in software engineering or related area.
2 years' supervisory experience.

Preferred Qualifications:
Master's degree in computer science, computer engineering, computer information systems, software engineering, or related area and 3 years' experience in software engineering or related area.
We value candidates with a background in creating inclusive digital experiences, demonstrating knowledge in implementing Web Content Accessibility Guidelines (WCAG) 2.2 AA standards, assistive technologies, and integrating digital accessibility seamlessly.

Primary Location:
1395 Crossman Ave, Sunnyvale, CA 94089-1114, United States of America