Oracle

Senior Principal Software Engineer - AI Infrastructure Innovation

Oracle, Nashville, Tennessee, United States, 37247


Overview

Oracle Cloud Infrastructure’s (OCI) architecture development engineering team is seeking a highly driven GPU platform software & system development engineer at the Principal Engineer level for AI Infrastructure Innovation. We are at the forefront of AI innovation, exploring the next generation of AI accelerators and hardware solutions.

Job level: Career Level - IC5.

Responsibilities

Evaluate system architectures and analyze proposed implementation paths.
Work directly with hardware design and development teams on architecture, implementation, development, deployment, and troubleshooting of AI hardware platforms.
Collaborate with the wider Oracle engineering and operations teams and with external partners.
Conduct comprehensive benchmarking and performance analysis of AI accelerators from emerging hardware vendors (e.g., SambaNova, Groq).
Compare and contrast new AI accelerators with industry-standard hardware (e.g., NVIDIA GPUs) for training and inference workloads.
Develop tools and processes for evaluating the performance of hardware in real-world AI applications.
Contribute to the design and improvement of performance optimization algorithms for AI models running on the hardware.

Basic Qualifications

BS or MS degree in Computer Science or a relevant technical field, or equivalent practical experience.
10+ years of total experience in software development.
Demonstrated ability to write code using Java, GoLang, C#, or similar OO languages.
Solid knowledge of AI / GPU platform architectures and their capabilities.
Experience working on large-scale, highly distributed services infrastructure.
Solid working experience with GPU supplier test code and open-source AI test / characterization tools.
Experience with architecture, design, and implementation of modern server platforms with x86 and ARM architectures.
Demonstrated experience debugging and root-causing complex issues with potential hardware and software causes.
Systematic problem-solving, strong communication, ownership, and drive.

Preferred Qualifications

Experience as technical lead on a large-scale cloud service.
Hands-on experience developing and maintaining services on a public cloud platform (e.g., AWS, Azure, Oracle).
Experience with AI accelerator chips (e.g., SambaNova, Groq).
Knowledge of AI accelerator benchmarks and tools (e.g., MLPerf, DeepBench).
Understanding of AI model optimization techniques for hardware acceleration.
Experience running firmware and system diagnostics tooling using BMC firmware, UEFI/BIOS and Linux tools; scripting to customize tests.

Compensation and Benefits

US: Hiring Range in USD from: $96,800 - $251,600 per year. May be eligible for bonus, equity, and compensation deferral. Oracle offers a comprehensive benefits package including medical, dental, vision, disability, life insurance, 401(k) matching, paid time off, holidays, sick leave, parental leave, adoption assistance, stock purchase plan, and more.

About Us

As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’re committed to an inclusive workforce and providing opportunities for all. Oracle is an Equal Employment Opportunity Employer. For accessibility accommodations, contact accommodation-request_mb@oracle.com or +1 888 404 2494 in the United States.
