84.51°

Lead AI/ML Engineer (P4368)

84.51°, Chicago, Illinois, United States, 60290

84.51° is a retail data science, insights and media company. We help The Kroger Co., consumer packaged goods companies, agencies, publishers and affiliates create more personalized and valuable experiences for shoppers across the path to purchase. Powered by cutting-edge science, we utilize first-party retail data from more than 62 million U.S. households sourced through the Kroger Plus loyalty card program to fuel a more customer-centric journey using 84.51° Insights, 84.51° Loyalty Marketing and our retail media advertising solution, Kroger Precision Marketing.

Join us at 84.51°!

Cincinnati / Chicago

SUMMARY

The Lead AI/ML Engineer role requires a unique mix of software engineering and AI skills to create, deploy and maintain computationally efficient proprietary SLM, LLM, and embedding model implementations, serving infrastructure, and end-to-end solutions. This role focuses specifically on model serving and operations within our foundation models team. A strong understanding of distributed systems, model serving architectures, GPU cluster management, and MLOps best practices that scale across enterprise workloads and large-scale model deployments is critical to success.

RESPONSIBILITIES

- Lead large-scale foundation model projects that can span months, focusing on model serving, inference optimization, and production deployment
- Foster a collaborative and innovative team environment, encouraging professional growth and development among junior team members in foundation model technologies
- Leverage known patterns, frameworks, and tools for automating and deploying foundation model serving solutions using Triton, vLLM, and other inference engines
- Develop new tools, processes and operational capabilities to monitor and analyze foundation model performance, latency, throughput, and resource utilization
- Work with researchers and ML engineers to optimize and scale foundation model serving using best practices in distributed systems, GPU orchestration, and MLOps
- Abstract foundation model serving solutions as robust APIs, microservices, or components that can be reused across the business with high availability and low latency
- Build, steward, and maintain production-grade foundation model serving infrastructure (robust, reliable, maintainable, observable, scalable, performant) to manage and serve LLMs, SLMs, and embedding models at scale
- Research state-of-the-art foundation model serving technologies, inference optimization techniques, and distributed GPU architectures to identify new opportunities for implementation across the enterprise
- Design and implement distributed GPU clusters for model training and inference workloads across GCP and Azure cloud environments
- Understand business requirements and trade off latency, cost, throughput, and model accuracy to maximize value and translate research into production-ready serving solutions
- Reduce time to deployment, automate foundation model CI/CD pipelines, implement continuous monitoring of model serving metrics, and establish feedback loops for model performance
- Conduct code reviews, infrastructure reviews, and production readiness assessments for foundation model deployments
- Apply appropriate documentation, version control, infrastructure as code practices, and other internal communication practices across channels
- Make time-sensitive decisions and solve urgent production issues in foundation model serving environments without escalation

QUALIFICATIONS, SKILLS, AND EXPERIENCE

Required:

- Bachelor's degree or higher in Machine Learning, Computer Science, Computer Engineering, Applied Statistics, or related field
- 5+ years of experience developing cloud-based software solutions with understanding of design for scalability, performance, and reliability in distributed systems
- 2+ years of hands-on experience with foundation models (LLMs, SLMs, embedding models) in production environments; 2+ years of experience in model serving and inference optimization preferred
- Deep knowledge of foundation model serving frameworks, particularly Triton Inference Server and vLLM
- Working experience with PyTorch models and optimization for inference (quantization, pruning, ONNX, TensorRT)
- Knowledge of distributed GPU computing, CUDA programming, and GPU memory optimization techniques
- Hands-on experience with GCP and Azure cloud platforms, including GPU instances, managed services, and networking
- Experience with Databricks for large-scale data processing and model training workflows
- Knowledge of vector databases and embedding model serving
- Strong experience with open-source LLM fine-tuning frameworks (LoRA, QLoRA, full fine-tuning)
- Experience building large-scale model serving solutions that have been successfully delivered to production with enterprise SLAs
- Excellent communication skills, particularly on technical topics related to distributed systems and model serving architectures
- Kubernetes and Docker experience with focus on GPU workloads and model serving deployments
- CI/CD pipeline experience with focus on ML model deployment; GitHub Actions experience preferred
- Terraform experience for infrastructure as code, particularly for GPU clusters and cloud ML infrastructure
- Strong skills in Python, with experience in async programming and high-performance computing
- API development experience with focus on high-throughput, low-latency model serving endpoints
- Experience with monitoring and observability tools for distributed systems (Prometheus, Grafana, DataDog, etc.)
- Knowledge of E2E Machine Learning pipeline and MLOps tools (model registry, experiment tracking, feature stores, model monitoring) in the context of foundation models

Preferred:

- Experience with distributed training frameworks such as DeepSpeed, FSDP, FairScale
- Knowledge of model compression techniques and hardware acceleration
- Experience with multi-cloud deployments and hybrid cloud architectures
- Familiarity with emerging foundation model architectures and serving optimizations

Pay Transparency and Benefits

The stated salary range represents the entire span applicable across all geographic markets from lowest to highest. Actual salary offers will be determined by multiple factors including but not limited to geographic location, relevant experience, knowledge, skills, other job-related qualifications, and alignment with market data and cost of labor. In addition to salary, this position is also eligible for variable compensation.

Below is a list of some of the benefits we offer our associates:

Health:
- Medical: with competitive plan designs and support for self-care, wellness and mental health.
- Dental: with in-network and out-of-network benefits.
- Vision: with in-network and out-of-network benefits.

Wealth:
- 401(k) with Roth option and matching contribution.
- Health Savings Account with matching contribution (requires participation in qualifying medical plan).
- AD&D and supplemental insurance options to help ensure additional protection for you.

Happiness:
- Hybrid work environment.
- Paid time off with flexibility to meet your life needs, including 5 weeks of vacation time, 7 health and wellness days, 3 floating holidays, as well as 6 company-paid holidays per year.
- Paid leave for maternity, paternity and family care instances.

Pay Range: $91,000 - $218,750 USD

Location options vary by role and are stated in each job posting. Our headquarters are in Cincinnati, OH. We also have hubs in Chicago, Deerfield, New York and Portland. Not all roles are open in all locations. We have an in-office expectation for Monday-Thursday with the option to work remote on Fridays (may vary by role), and candidates must live in one of our hub locations or be willing to relocate (relocation assistance is provided).

Individuals who have questions about this Applicant Notice should contact 84.51° by using the Contact Us link at the bottom of our homepage at www.8451.com.

84.51° Demographic Questions

Together, we are stronger and can achieve more

At 84.51°, we believe a diverse and inclusive work environment is essential to the work we do as a data science company. Just as no two Kroger customers are alike, no two 84.51° associates are alike. We understand the importance of fostering an inclusive culture: to encourage our associates to bring their authentic selves to work – embracing who they are and celebrating what they can become.

We continually strive to ensure 84.51° is a place where all people feel like they belong, are respected and valued regardless of who they are, where they are from and what experiences they’ve had. By meeting our 3-year D&I roadmap goals and commitments, we will continue our journey towards becoming a destination for diverse, driven, and authentic minds.

Your responses will be used (in aggregate only) to help us identify areas of improvement in our process.

Your responses will not be associated with your specific application and will not in any way be used in the hiring decision.

