Handshake
Overview
AI Research Scientist - Evaluation, Handshake AI
Handshake AI is a human data labeling business that leverages the scale of the largest early career network. We work directly with the world's leading AI research labs to build a new generation of human data products. From PhDs in physics to undergrads fluent in LLMs, Handshake AI is the trusted partner for domain-specific data and evaluation at scale. This is a unique opportunity to join a fast-growing team shaping the future of AI through better data, better tools, and better systems, for experts, by experts. Now's a great time to join Handshake.
About Handshake AI
Handshake is building the career network for the AI economy. Our three-sided marketplace connects 18 million students and alumni, 1,500+ academic institutions across the U.S. and Europe, and 1 million employers to power how the next generation explores careers, builds skills, and gets hired.
About The Role
Design and conduct original research in LLM understanding, evaluation methodologies, and the dynamics of human-AI knowledge interaction
Develop novel evaluation frameworks and assessment techniques that reveal deep insights into model capabilities and limitations
Collaborate with engineers to transform research breakthroughs into scalable benchmarks and evaluation systems
Pioneer new approaches to measuring model understanding, reasoning capabilities, and alignment with human knowledge
Write high-quality code to support large-scale experimentation, evaluation, and knowledge assessment workflows
Publish findings in top-tier conferences and contribute to advancing the field's understanding of AI capabilities
Work with cross-functional teams to establish new standards for responsible AI evaluation and knowledge alignment
Desired Capabilities
PhD or equivalent research experience in machine learning, computer science, cognitive science, or a related field, with a focus on AI evaluation or understanding
Strong background in LLM research, model evaluation methodologies, interpretability, or foundational AI assessment techniques
Demonstrated ability to independently lead post-training and evaluation research projects from theoretical framework to empirical validation
Proficiency in Python and deep experience with PyTorch for large-scale model analysis and evaluation
Experience designing and conducting experiments with large language models, benchmark development, or systematic model assessment
Strong publication record in post-training, AI evaluation, model understanding, interpretability, or related areas that advance our comprehension of AI capabilities
Ability to clearly communicate complex insights about model behavior, evaluation methodologies, and their implications for AI development
Extra Credit
Experience with RL, agent modeling, or AI alignment
Familiarity with data-centric AI approaches, synthetic data generation, or human-in-the-loop systems
Understanding of the challenges in scaling foundation models (e.g., training stability, safety, inference efficiency)
Contributions to open-source AI libraries or research tooling
Interest in shaping the societal impact, deployment ethics, and governance of frontier models
Perks
Handshake delivers benefits that help you feel supported and thrive at work and in life. The below benefits are for full-time US employees.
Ownership: Equity in a fast-growing company
Financial Wellness: 401(k) match, competitive compensation, financial coaching
Family Support: Paid parental leave, fertility benefits, parental coaching
Wellbeing: Medical, dental, and vision, mental health support, $500 wellness stipend
Growth: $2,000 learning stipend, ongoing development
Remote & Office: Stipends for home office setup, internet, commuting, and free lunch/gym in our SF office
Time Off: Flexible PTO, 15 holidays + 2 flex days, winter #ShakeBreak where our whole office closes for a week
Connection: Team outings & referral bonuses
Explore our mission, values, and comprehensive US benefits at joinhandshake.com/careers.
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Engineering and Information Technology
Industries
Software Development