LMArena

Machine Learning Engineer

LMArena, San Francisco, California, United States, 94199

About the Role

LMArena is seeking a Senior Machine Learning Engineer to help scale and strengthen the core infrastructure that powers real-world AI evaluation. You’ll play a foundational role in shaping how we build, deploy, and improve our model benchmarking systems, working across data pipelines, inference APIs, and new evaluation methodologies. This is an opportunity to apply your technical expertise to a platform trusted by millions, and to help define how cutting-edge AI is assessed in the wild.

Location: SF Bay Area/Remote

Type: Full-Time

Responsibilities

Architect and build the core modeling for data and evaluation products

Own the full stack of data, model training, and evaluation pipelines

Help grow a culture of feedback and rapid product iteration as we build new features

Conduct research into state-of-the-art evaluation methods and contribute to the long-term vision for a centralized, scalable evaluation platform

Who is LMArena?

Created by researchers from UC Berkeley’s SkyLab, LMArena is an open platform where everyone can easily access, explore, and interact with the world’s leading AI models. By comparing them side by side and casting votes for the better response, the community helps shape a public leaderboard, making AI progress more transparent and grounded in real‑world usage.

Why Join Us?

Trusted by organizations like Google, OpenAI, Meta, xAI, and more, LMArena is rapidly becoming essential infrastructure for transparent, human‑centered AI evaluation at scale. With over one million monthly users and growing developer adoption, we are helping guide the next generation of safe, aligned AI systems, grounded in open access and collective feedback.

Requirements

Strong programming skills and the ability to work across a typical recommendation system or LLM stack

Experience in deep learning, language models, or reward model training

Experience with LLM fine‑tuning, prompt engineering, function calling, etc.

Self‑motivated with a willingness to take ownership of tasks

A passion for shipping quality products

4+ years of industry experience or relevant projects

Solid understanding of statistics and of how to evaluate uncertainty when making product shipping decisions

What we offer

Salary: 210k - 250k + equity, commensurate with experience and location

Competitive salary and meaningful equity

Comprehensive healthcare coverage (medical, dental, vision)

The opportunity to work on cutting‑edge AI with a small, mission‑driven team

A culture that values transparency, trust, and community impact
