MuleSoft

SMTS Software Engineer - ML Infrastructure

MuleSoft, San Francisco, California, United States, 94199


The Einstein products and platform democratize AI and transform the way our Salesforce Ohana builds trusted machine learning and AI products - in days instead of months. They augment the Salesforce Platform with the ability to easily create, deploy, and manage generative AI and predictive AI applications across all clouds. We achieve this vision by providing unified, configuration-driven, and fully orchestrated machine learning APIs, customer-facing declarative interfaces, and various microservices covering the entire machine learning lifecycle, including data, training, predictions/scoring, orchestration, model management, model storage, and experimentation. We already produce over a billion predictions per day, train thousands of models per day along with dozens of different large language models, and serve thousands of customers. We enable customers to use leading large language models (LLMs), both internally and externally developed, so they can leverage them in their Salesforce use cases. Combined with the power of Data Cloud, this platform gives customers an unparalleled advantage for quickly integrating AI into their applications and processes.

What you’ll do:

Design and deliver scalable generative AI services that integrate with many applications, serve thousands of tenants, and run at scale in production.

Drive system efficiencies through automation, including capacity planning, configuration management, performance tuning, monitoring and root cause analysis.

Participate in periodic on-call rotations and be available for critical issues.

Partner with Product Managers, Application Architects, Data Scientists, and Deep Learning Researchers to understand customer requirements, design prototypes, and bring innovative technologies to production.

Participate in meal conversations with your team members about really important topics, such as: Should the cuteness of panda bears be a factor in their survivability? Is love a decision tree or a regression model? How far ahead would society be today if we had 12 fingers instead of 10?

Required Skills:

6+ years of industry experience in ML engineering, building AI systems and/or distributed services.

Bachelor's or Master's degree in Computer Science, Software Engineering, or a related STEM field, with strong competencies in algorithms, data structures, and software design.

Experience building distributed microservice architectures on AWS, GCP, or other public cloud substrates.

Experience with a modern containerized deployment stack built on Kubernetes, Spinnaker, and other technologies.

Proven ability to implement, operate, and deliver results via innovation at large scale.

Strong programming expertise in JVM-based languages (Java, Scala) and Python.

Experience with distributed, scalable systems and modern data storage, messaging and processing frameworks, including Kafka, Spark, Docker, Hadoop, etc.

Grit, drive and a strong feeling of ownership coupled with collaboration and leadership.

Preferred Skills:

Understanding of MLOps/ML infrastructure workflows, processes, and ML components

Strong experience building and applying machine learning models for business applications

Working or academic knowledge of SageMaker, TensorFlow, PyTorch, Triton, Spark, or equivalent large-scale distributed machine learning technologies

Fantastic problem solver; ability to solve problems that the world has not solved before

Excellent written and spoken communication skills

Demonstrated track record of cultivating strong working relationships and driving collaboration across multiple technical and business teams
