Delaware Staffing
Principal Software Engineer
OCI is Oracle's next-generation cloud platform, built for the most demanding enterprise workloads. The AI Platform, Services & Solutions organization within OCI is building a robust ecosystem to support the end-to-end lifecycle of AI and machine learning workloads. We are looking for a Principal Software Engineer to join our growing team and help shape the future of AI infrastructure and services at Oracle. You will work on critical components of OCI's AI platform, including high-scale GPU cluster management, self-service ML infrastructure, and model serving systems.
Work on critical AI infrastructure that powers Oracle's GenAI and ML initiatives. Contribute to high-impact projects with visibility across Oracle Cloud. Collaborate with top engineers and researchers in a fast-paced, innovation-driven environment. Grow your career in a supportive, mission-driven team building the future of enterprise AI.
As a Principal Software Engineer on the team, you will work with teams of software engineers responsible for the design, development, and operation of our new and existing features. You should be able to architect broad systems interactions, be hands-on, dive deep into any part of the stack, and have a good working knowledge of cloud infrastructure and networking. You should be able to work seamlessly in a collaborative, agile environment, and be excited to learn. IC4s work independently and provide technical leadership to the broader organization. You should have experience developing and operating high-scale services, and understand how to make these cloud-scale services resilient while balancing speed and quality through iterative and incremental improvements. You should understand operational excellence and know how to instill a proactive culture within your team. You will recommend and justify major changes to new and existing products and establish consensus with data-driven approaches.
Build cloud services on top of the modern Infrastructure as a Service (IaaS) building blocks at OCI. Design and build distributed, scalable, fault-tolerant software systems. Participate in the entire software lifecycle: development, testing, CI, and production operations. Design and lead software projects without needing significant guidance, and guide, mentor, and coach junior engineers. Balance product feature development with production operational concerns such as writing runbooks, ops automation, structured logging, and instrumentation for metrics and events. Leverage internal tooling at OCI to develop, build, deploy, and troubleshoot software. Participate in on-call for the service with the team.
8+ years of experience shipping scalable, cloud-native distributed systems. BS in Computer Science or equivalent experience. Proficient in Go, Java, Python, and shell scripting. Experience with container orchestration such as Kubernetes or Docker Swarm. Experience building highly available services, with knowledge of common service-oriented design patterns and service-to-service communication protocols. Experience with components of modern infrastructure such as containerization and software-defined networking. Experience with production operations, including best practices for putting quality code in production and troubleshooting issues when they arise. Able to communicate technical ideas effectively, both verbally and in writing (technical proposals, design specs, architecture diagrams, and presentations).
MS in Computer Science preferred. Experience building control plane/data plane solutions for cloud-native companies. Experience diagnosing, troubleshooting, and resolving performance issues in complex environments. Deep understanding of Unix-like operating systems. Production experience with cloud and ML technologies. Experience with generative AI, LLMs, and machine learning.