Oracle
Job Description
The Oracle Cloud Infrastructure (OCI) Cluster Networking team is building an ultra‑high‑performance network to support AI/ML/HPC workloads. Join us to design systems that scale from tens to hundreds of thousands of GPUs without sacrificing performance. Our team develops and tunes the software and hardware stack for distributed workloads using libraries such as NCCL on high‑speed networks.
Strong knowledge of and practical experience with NCCL are essential for this role. You’ll apply collective communication libraries to tune system performance at a previously unheard‑of scale. Our approach to scaling is cutting edge.
Key Experience
7+ years of experience with software (systems/application) development
2+ years of experience with collective communication libraries such as NCCL, RCCL, and MPI, and with GPU frameworks such as CUDA and ROCm
2+ years of experience with ML training frameworks such as PyTorch or TensorFlow
Proficiency in any two of C/C++, Python, Java, Scala, and Go
Proficiency with data structures, algorithms, and operating systems
Excellent organizational, verbal, and written communication skills
Bachelor’s degree in Computer Science and Engineering or a related engineering field
Preferred Qualifications
Master's / PhD degree in Computer Science or related engineering fields
Experience with RDMA programming, including but not limited to GPUDirect RDMA
Experience with distributed workload managers like Slurm or K8s
Experience with Linux performance tools
Experience in SDN, NFV, Cloud Networking
Experience in Infrastructure-as-a-Service (OpenStack, AWS, GCP, Azure)
Salary & Benefits
US: Hiring range in USD: $96,800 to $223,400 per year. May be eligible for bonus and equity.
Oracle offers a comprehensive benefits package:
Medical, dental, and vision insurance
Short‑term and long‑term disability
Life insurance and AD&D
Supplemental life insurance
Health care and dependent care flexible spending accounts
Pre‑tax commuter and parking benefits
401(k) savings and investment plan with company match
Paid time off and vacation accrual
Paid sick leave
Paid parental leave
Adoption assistance
Employee Stock Purchase Plan
Financial planning and group legal services
Voluntary benefits (auto, homeowner, pet insurance)
Disclaimer
Some roles may require immunization and occupational health mandates. Certain US customer or client‑facing roles may require compliance with applicable requirements.
Range and benefit information provided in this posting is specific to the stated locations only. Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions, and locations.
EEO Statement
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veteran status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.