Oracle

Senior Software Engineer

Oracle, Austin, Texas, US, 78716

OCI (Oracle Cloud Infrastructure) AI Infrastructure is building an ultra‑high‑performance GPU platform designed to support AI/ML/HPC workloads. This role is part of the GPU Availability and Monitoring team in the Compute Org, which is responsible for designing and developing architectural changes for GPU delivery, health monitoring, triage automation, and diagnostic services. You will work on distributed AI/ML/HPC workloads across thousands of GPUs, leveraging technologies like RoCE and InfiniBand.

Responsibilities

Work independently in ambiguous situations to ensure adherence to published standards and practices.

Design, develop, troubleshoot, and debug software programs for various cloud infrastructure components, including databases, applications, tools, and networks.

Take an active role in defining and evolving standard practices and procedures for software engineering, with a focus on AI‑driven development.

Design, develop, and debug software applications and operating system components, leveraging AI and ML techniques.

Lead the development of critical initiatives, including:

Design and implement ML‑based spike detection mechanisms for provisioning failures to minimize operational disruptions.

Expand integrations with Kafka to enable near real‑time actions supporting 1‑Day SLOs for hardware repairs, utilizing event‑driven architecture and stream processing.

Develop an automated ticket routing framework to streamline workflows, enhance efficiency, and reduce operational overhead, powered by NLP and ML.

Accelerate dedicated initiatives through collaborative efforts with cross‑functional teams and customers, applying AI‑driven insights and recommendations.

Harness the power of AI and ML to create innovative tools and frameworks that automate testing, simulate complex environments, and reproduce incidents, freeing up human ingenuity to focus on higher‑value tasks.

Collaborate and lead technical discussions across multiple teams to ensure seamless integrations and effective problem‑solving.

Provide direction and mentoring to junior engineers, sharing knowledge and expertise to promote growth and development.

Qualifications

Experience with Python, Java, or TypeScript.

Hands‑on experience in AI/ML, especially leveraging ML for operational monitoring and automation.

Strong background in Linux system programming and kernel‑level development.

Experience with Docker and container orchestration.

Knowledge of RESTful API design and API security.

Experience with cloud platforms (OCI, AWS, Azure, GCP) and familiarity with cloud infrastructure concepts.

Ability to work independently and collaborate across teams.

Excellent problem‑solving and debugging skills.

Technical Skills

Programming languages: Python, Java, TypeScript

Development methodologies: Agile Principles

Data management: data modeling, data warehousing, data governance

Cloud infrastructure: OCI, AWS, Azure, GCP

Operating systems: Linux, macOS

Scripting languages: Bash, Perl, Ruby

Containerization: Docker

API design: RESTful APIs, API gateways, API security

AI tools: chatbots, virtual assistants, predictive analytics

Database: MySQL, caching technologies (Redis, MemoryCache)

Systems architecture: data synchronization, fault tolerance, state management

Networking: general enterprise storage, networking, and computing experience

Benefits

Medical, dental, and vision insurance

Short‑term and long‑term disability coverage

Life insurance and AD&D

Supplemental life insurance

Health care and dependent care Flexible Spending Accounts

Pre‑tax commuter and parking benefits

401(k) savings and investment plan with company match

Paid time off, holidays, and paid sick leave

Paid parental leave and adoption assistance

Employee Stock Purchase Plan

Voluntary benefits: auto, homeowner, pet insurance

Hiring Information

US: Hiring range $79,200 – $178,100 per year, plus potential bonus and equity.

About Oracle

Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or protected veteran status.
