NVIDIA
Senior Deep Learning Software Engineer, PyTorch - TensorRT Performance
NVIDIA, Myrtle Point, Oregon, United States, 97458
Employer Industry: Technology - Deep Learning and AI Solutions
Why Consider This Job Opportunity
Salary up to $287,500
Opportunity for equity and comprehensive benefits
Engage in groundbreaking work within the rapidly evolving field of deep learning and AI
Collaborate with diverse teams across various innovative applications, including generative AI and robotics
Contribute to the development of performance optimization solutions that empower advanced AI technologies
Work in a hybrid environment that supports flexibility and collaboration
What to Expect (Job Responsibilities)
Analyze performance issues and identify optimization opportunities within Torch-TensorRT and TensorRT (see the compilation sketch after this list)
Contribute features and code to NVIDIA and open-source inference frameworks, including Torch-TensorRT, TensorRT, and PyTorch
Collaborate with cross-functional teams to develop innovative inference solutions across multiple domains
Scale the performance of deep learning models across different architectures and types of NVIDIA accelerators
Implement graph compiler algorithms, frontend operators, and code generators within the software stack
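To make the first two responsibilities concrete, here is a minimal sketch of compiling a PyTorch model through Torch-TensorRT's Dynamo frontend. It assumes torch and torch_tensorrt are installed and a CUDA GPU is available; the tiny model, input shape, and FP16 precision choice are illustrative placeholders, not details from the posting.

    # Minimal sketch: compiling a PyTorch model with Torch-TensorRT's Dynamo frontend.
    # Assumes torch and torch_tensorrt are installed and a CUDA GPU is available;
    # the model, input shape, and FP16 precision are illustrative placeholders.
    import torch
    import torch_tensorrt

    model = torch.nn.Sequential(
        torch.nn.Conv2d(3, 16, kernel_size=3, padding=1),
        torch.nn.ReLU(),
    ).eval().cuda()
    example_input = torch.randn(1, 3, 224, 224, device="cuda")

    # Lower the model through the Dynamo frontend into TensorRT engines.
    trt_model = torch_tensorrt.compile(
        model,
        ir="dynamo",
        inputs=[example_input],
        enabled_precisions={torch.float16},
    )

    with torch.no_grad():
        print(trt_model(example_input).shape)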
What is Required (Qualifications)
Bachelor's, Master's, or PhD degree, or equivalent experience, in a relevant field (Computer Science, Computer Engineering, EECS, AI)
Minimum of 4 years of relevant software development experience
Excellent Python/C++ programming, software design, and software engineering skills
Experience with a deep learning framework such as PyTorch, JAX, or TensorFlow
Experience in performance analysis and optimization (a minimal profiling sketch follows this list)
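As a minimal sketch of the performance-analysis work referenced above, the standard torch.profiler API can rank operators by accumulated device time; the model here is an illustrative placeholder and a CUDA GPU is assumed.

    # Minimal sketch: locating optimization opportunities with torch.profiler.
    # The model is an illustrative placeholder; a CUDA GPU is assumed.
    import torch
    from torch.profiler import profile, ProfilerActivity

    model = torch.nn.Linear(1024, 1024).cuda()
    x = torch.randn(64, 1024, device="cuda")

    with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
        with torch.no_grad():
            model(x)

    # Rank operators by accumulated GPU time to see where tuning effort pays off.
    print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))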
How to Stand Out (Preferred Qualifications)
Knowledge of GPU architecture
Prior experience with AOT (ahead-of-time) or JIT (just-in-time) compilers for deep learning inference, such as TorchDynamo or TorchInductor
Experience with performance modeling, profiling, debugging, and code optimization for DL/HPC applications
Proficiency in GPU programming and domain-specific languages such as CUDA, TileIR, CuTe DSL, CUTLASS, or Triton (see the Triton sketch after this list)
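As a small taste of the GPU DSL work listed above, here is a minimal Triton vector-add kernel; the block size and launch grid are illustrative choices rather than requirements from the posting, and a CUDA GPU is assumed.

    # Minimal sketch: a Triton vector-add kernel; block size and grid are
    # illustrative choices, and a CUDA GPU is assumed.
    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
        pid = tl.program_id(axis=0)
        offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
        mask = offsets < n_elements  # guard the ragged final block
        x = tl.load(x_ptr + offsets, mask=mask)
        y = tl.load(y_ptr + offsets, mask=mask)
        tl.store(out_ptr + offsets, x + y, mask=mask)

    def add(x, y):
        out = torch.empty_like(x)
        n = x.numel()
        grid = (triton.cdiv(n, 1024),)  # one program instance per block
        add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
        return out

    a = torch.randn(4096, device="cuda")
    b = torch.randn(4096, device="cuda")
    assert torch.allclose(add(a, b), a + b)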
We prioritize candidate privacy and champion equal-opportunity employment. Central to our mission is our partnership with companies that share this commitment. We aim to foster a fair, transparent, and secure hiring environment for all. If you encounter any employer not adhering to these principles, please bring it to our attention immediately.
We are not the Employer of Record (EOR) for this position. Our role in this specific opportunity is to connect outstanding candidates with a top-tier employer.