Apple Inc.

Applied Scientist, AI Evaluation Platform

Apple Inc., Seattle, Washington, US, 98127

Software and Services

Apple is where individual imaginations gather together, committing to the values that lead to great work. Every new product we build, service we create, or Apple Store experience we deliver is the result of us making each other’s ideas stronger. That happens because every one of us shares a belief that we can make something wonderful and share it with the world, changing lives for the better. It’s the diversity of our people and their thinking that inspires the innovation that runs through everything we do. When we bring everybody in, we can do the best work of our lives. Here, you’ll do more than join something — you’ll add something.

Description

Our team, part of Apple Services Engineering, is looking for an Applied Scientist to lead the design and continuous development of automated benchmarking methodologies for AI-powered code assistant tools. In this role, you will investigate how coding-focused LLM agents behave, create rigorous evaluation frameworks, and establish scientific standards for assessing their quality and reliability. This role is crafted to enable the development of scalable evaluation frameworks that ensure our engineers have the right tools to create products that surprise and delight our customers. The successful candidate will have a proactive approach with the ability to work independently and collaboratively on a wide range of projects. In this role, you will work alongside a small but impactful team, collaborating with ML and data scientists, software developers, project managers and other teams at Apple to understand requirements and translate them into scalable, reliable, and efficient evaluation frameworks.

Responsibilities

Design scientifically grounded benchmarking methodologies for code assistants, covering multiple dimensions of quality (e.g. correctness, performance) across several use cases.

Develop automated evaluation pipelines that collect, automatically judge, and analyze model outputs at scale.

Create and curate datasets, tasks, and coding scenarios that represent realistic developer workflows across multiple languages and domains.

Define and validate new metrics for complex phenomena such as tool reliability, reasoning quality, or multi-turn developer interaction patterns.

Apply statistical rigor to the metrics above and ensure that results are reproducible.

Work closely with engineering and research teams to translate experimental findings into actionable model improvements.

Publish internal reports and external papers.

Monitor evolving industry practices and academic work to ensure benchmarks remain relevant.

Minimum Qualifications

Advanced degree (MS or PhD) in Computer Science or Software Engineering, or equivalent research/work experience.

Strong research background in empirical evaluation, experimental design, or benchmarking.

Strong proficiency in Python. Intermediate proficiency in Swift.

Deep familiarity with software engineering workflows and developer tools.

Experience working with or evaluating AI/ML models, preferably LLMs or program synthesis systems.

Strong analytical and communication skills, including the ability to write clear reports.

Preferred Qualifications

Publications in ML evaluation or related fields.

Experience with automated testing frameworks.

Experience constructing human-in-the-loop or multi-turn evaluation setups.

Prior work on agentic developer tools.

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $139,500 and $210,100, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.

Apple accepts applications to this posting on an ongoing basis.
