ITCO Solutions
100% Remote
Job Title: Senior AI/ML Engineer - Large Language Model Pretraining (100B+ Parameters)
Location: West Coast (100% Remote)
Role Overview
We are seeking Senior AI/ML Engineers with a PhD or Master's degree in Computer Science or a related field from a top-20 university. You will lead the pretraining of massive LLMs (100B+ parameters), which requires deep expertise in distributed training, large-scale optimization, and model architecture. This is a rare opportunity to work with petabyte-scale datasets and cutting-edge compute clusters in a high-impact environment.
Key Responsibilities
- Architect and implement large-scale training pipelines for LLMs with 100B+ parameters.
- Optimize distributed training performance across thousands of GPUs/TPUs.
- Collaborate with research scientists to translate experimental results into production-grade training runs.
- Manage and preprocess petabyte-scale datasets for pretraining.
- Implement state-of-the-art techniques in scaling laws, model parallelism, and memory optimization.
- Conduct rigorous benchmarking, profiling, and performance tuning.
- Contribute to Client research in LLM architecture, training stability, and efficiency.

Required Qualifications
- Advanced degree (PhD or Master's) in Computer Science, Machine Learning, or a related field from a top-20 global CS university.
- 3+ years of hands-on experience with large-scale deep learning model training.
- Proven experience pretraining models exceeding 10B parameters, preferably 100B+.
- Deep expertise in distributed training frameworks (DeepSpeed, Megatron-LM, PyTorch FSDP, TensorFlow Mesh, JAX/TPU).
- Proficiency with parallelism strategies (data, tensor, pipeline) and mixed-precision training.
- Experience with large-scale cloud or HPC environments (AWS, Azure, GCP, Slurm, Kubernetes, Ray).
- Strong skills in Python, CUDA, and performance optimization.
- Strong publication record in top-tier ML/AI venues (NeurIPS, ICML, ICLR, ACL, etc.) preferred.

Preferred Skills
- Experience with LLM fine-tuning (RLHF, LoRA, PEFT).
- Familiarity with tokenizer development and multilingual pretraining.
- Knowledge of scaling laws and model evaluation frameworks for massive LLMs.
- Hands-on work with petabyte-scale distributed storage systems.
E-Verify: United States Employment Opportunities Only
E-Verify is an internet-based system operated by the Department of Homeland Security and the Social Security Administration and allows employers to confirm an individual's employment eligibility to work in the United States. Under the E-Verify rules, effective September 8, 2009, federal agencies subject to the Federal Acquisition Regulation are required to modify, and include in new contracts, a provision that requires federal contractors and subcontractors to use E-Verify. ITCO Solutions is required to adhere to these requirements.
This message is intended for the use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.