NVIDIA
Deep Learning Scientist, LLM Training Datasets
NVIDIA, Myrtle Point, Oregon, United States, 97458
Employer Industry: Technology (Deep Learning and Artificial Intelligence)
What to Expect
Develop datasets for LLM pre-training and post-training, optimize models, and evaluate their performance
Design and implement data strategies for model training and evaluation, including data collection and management
Generate high-quality synthetic data to augment existing datasets for various use cases
Define data annotation guidelines and curate labeled datasets for model alignment, including reinforcement learning
Conduct experiments to optimize Large Language Models using various techniques
What is Required (Qualifications)
Master’s or PhD in Computer Science, Electrical Engineering, or a related field, or equivalent experience
3+ years of experience in developing datasets and training large language models or generative AI models
Hands-on programming expertise in Python
Solid understanding of machine learning concepts and algorithms related to data management
Experience with synthetic data generation techniques and evaluation strategies
How to Stand Out (Preferred Qualifications)
Strong track record of contributions to open-source data tools or research publications
Experience with cloud platforms (e.g., AWS, GCP, Azure) and data storage systems (e.g., S3, Google Cloud Storage)
Habit of continually evaluating new tools, techniques, and methodologies in data engineering and generative AI
Passion for AI with prior scientific research and publication experience
Benefits / Compensation
Base salary range of $148,000 - $287,500, depending on level and experience
Eligibility for equity and a comprehensive benefits package
Opportunity for career advancement in cutting-edge fields such as Deep Learning and AI
Work alongside some of the most forward-thinking professionals in the industry
Engage in innovative research and contribute to impactful projects in generative AI
Be part of a diverse and inclusive work environment that values all employees
We prioritize candidate privacy and champion equal-opportunity employment. Central to our mission is our partnership with companies that share this commitment. We aim to foster a fair, transparent, and secure hiring environment for all. If you encounter any employer not adhering to these principles, please bring it to our attention immediately. We are not the EOR (Employer of Record) for this position. Our role in this specific opportunity is to connect outstanding candidates with a top-tier employer.