Oracle

File Systems Software Engineer

Oracle, Santa Clara, California, US, 95053

Are you interested in delivering large-scale, high-performance, fault-tolerant solutions? Oracle’s Cloud Infrastructure team is building a next-generation Infrastructure-as-a-Service that supports the most demanding mission-critical customer requirements and operates at cloud scale to provide a secure, distributed, multi-tenant cloud environment. We're looking for hands-on engineers with a passion for solving difficult problems in distributed systems, virtualized infrastructure, and highly available services. Joining Oracle will give you the opportunity to design and build innovative new systems from the ground up and operate services at scale. Our engineers have significant technical and business impact while delivering critical enterprise-level features.

Job Description

As a Principal Member of Technical Staff, you will work with senior architects and product management to define requirements for OCI’s upcoming AI/ML storage infrastructure services. You should have deep experience with parallel filesystems operating in large-scale Linux environments, ideally with a working understanding of the Lustre architecture and codebase and experience troubleshooting issues, modifying code, or contributing improvements back to the Lustre git tree. Expertise in one or more Public Cloud offerings is a plus. You will be expected to make substantial contributions to our design and architecture, implement proofs of concept, and demonstrate excellent communication skills by clearly explaining complex technical concepts. As a technical leader, you will mentor junior engineers, review code, and write test automation. Valuing simplicity and scale, you should be comfortable working in a collaborative, agile environment and eager to learn.

Career Level - IC4

Qualifications

6+ years of experience delivering and operating large-scale, highly available distributed systems (more experience preferred)
Substantial code-level experience with filesystems in large-scale Linux environments
Strong proficiency with C and C++; Python and/or Java is a plus
Expertise in one or more Public Cloud offerings (OCI, AWS, GCP, Azure) is a plus
Experience with high-throughput I/O architectures like DAOS/SPDK is a strong plus
Background in RDMA and high-performance networking (SmartNICs, NVMe/TCP, RoCEv2) is a plus
Familiarity with AI/ML frameworks (TensorFlow/Keras, PyTorch, Scikit-Learn, XGBoost, Caffe) and MLOps/Kubernetes is a plus
Lustre experience is highly desirable
Strong knowledge of data structures, algorithms, operating systems, and distributed systems fundamentals
Strong troubleshooting and performance tuning skills
Self-motivated and able to thrive in a fast-paced environment
Bachelor's or Master's degree in Computer Science, Computer Engineering, or related field

This role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.
