techire ai
Join techire ai as the Head of Research – Post-Training & Reinforcement Learning. We’re looking for a Head of Research who will lead cutting‑edge work in post‑training and reinforcement learning to ensure AI models remain aligned, reliable, and safe.

Key Responsibilities
Guide a team of applied ML and research experts from leading AI labs.
Design and conduct experiments in RLHF, DPO, GRPO, and reward modeling.
Develop complex RL environments to stress‑test reasoning, planning, and long‑horizon behavior.
Translate research into production systems used by leading AI labs.
Publish papers at top venues (NeurIPS, ICLR, ACL, EMNLP) and release open‑source tools.
Shape technical vision and influence AI alignment standards and policies.

Required Qualifications
Deep research experience in post‑training or reinforcement learning methods (RLHF, DPO, GRPO, reward modeling).
Strong background in training and evaluating large language models.
Proven publication record at top‑tier venues (NeurIPS, ICLR, ICML, ACL, EMNLP).
Experience leading research teams and scoping high‑impact projects.
Curiosity, creativity, and ability to thrive in a fast‑moving startup environment.

Compensation & Benefits
$300k–$400k base salary + significant equity.
Full benefits include health, dental, vision, 401(k), unlimited PTO, and global off‑sites.
Relocation support available for San Francisco, with flexibility for exceptional candidates.

Employment Information
Seniority level – Mid‑Senior level
Employment type – Full‑time
Job function – Engineering, Information Technology & Science
Industries – Software Development, IT Services & Consulting, Research Services

All applicants will receive a response.