Reflection AI
Role Overview
Data quality and diversity are the foundation for training the best agents in any domain. As a member of the Data Team at Reflection, you will play a pivotal role in shaping how we collect and analyze human, synthetic, and internet data. This is an interdisciplinary role that primarily requires engineering, research, and communication skills, along with sharp attention to detail and a willingness to "roll up your sleeves" and look at the data.
Key Responsibilities
1. Experiment and Benchmark Design
- Develop techniques for collecting, augmenting, filtering, or synthesizing training and evaluation data using creativity and analytical thinking
- Design experiments, in collaboration with machine learning researchers, to assess the impact of different datasets on model performance
- When required, manage human annotators working on data collection efforts; this could include tracking payments and hours, training annotators, and providing technical support, feedback, and quality control
2. Qualitative and Quantitative Data Analysis
- Analyze collected data, e.g. coding tasks, both qualitatively and quantitatively
- Evaluate model behavior to identify its strengths and weaknesses
- Clearly communicate findings with machine learning research and product teams
3. Data Engineering
- Design, implement, and optimize scalable data pipelines to support reinforcement learning and supervised finetuning
- Leverage LLMs to perform data filtering, cleaning, and augmentation
Qualifications
- Software engineering background with experience building data processing pipelines at scale, particularly with LLM integration
- Proficiency in Python or other programming languages (Go, TypeScript, etc.)
- Detail-oriented and analytical, with the ability to conduct careful qualitative and quantitative data analysis
- Excellent organizational and communication skills to collaborate closely with cross-functional teams and manage human data operations
- Experience with machine learning, reinforcement learning, and LLMs is a plus, but not strictly required.
What We Offer
- The opportunity to work at the forefront of AI research and data collection for training cutting-edge models.
- Collaboration with a team of world-class researchers and engineers from top AI labs and companies.
- Competitive compensation and benefits, with opportunities for professional growth.