Dale WorkForce Solutions

Senior Site Reliability Engineer

Dale WorkForce Solutions, Seattle, Washington, US, 98127

Job Description

A unique opportunity to join a rapidly growing, world‑class team engineering the cutting‑edge storage infrastructure that underpins a major cloud provider. As part of the SRE team, you will solve interesting technical challenges by defining, designing, deploying, and troubleshooting the Object Storage system. The Object Storage system is a highly durable and available regional service for data plane, control plane, and virtualization of object storage. You will play a critical role in ensuring the service is reliable, scalable, resilient, secure, and performant.

Responsibilities

Develop object storage systems, code, and automation to scale deployments and mission‑critical operations for the cloud provider across data centres worldwide.

Engineer resilient storage systems that eliminate single points of failure, and develop test automation to promote reliability, security, scale, and performance.

Perform engineering activities to bootstrap new storage systems and work with cross‑functional teams to build regional services.

Understand the end‑to‑end configuration, technical dependencies, and overall behavioural characteristics of production services to provide incident response and on‑call support for production systems.

Utilise a deep understanding of the service topology and its dependencies to troubleshoot issues and define mitigations.

Perform capacity planning for object storage systems to meet performance and efficiency targets.

Collect system data to inform decisions and achieve monitoring and availability targets.

Adhere to change management and deployment procedures for multiple storage systems and data centres worldwide.

Support new product introduction activities and decommission legacy storage systems to promote health of the fleet.

Articulate technical characteristics of object storage systems and guide development teams to engineer and add premier capabilities to the object storage service portfolio.

Qualifications

Senior‑level software development proficiency to build systems and automation and to debug production and mission‑critical systems.

Strong proficiency in Java, Python and Shell scripting.

Expertise in Linux system internals, advanced system administration, and performance tuning.

Willingness to participate in a 24/7 on‑call schedule with customer notifications and escalations.

Deep understanding of networking protocols.

Real‑world experience with production architectures, scalability and system design with cloud computing and storage design patterns.

Strong methodical approach to troubleshooting large, complex, interconnected systems.

Familiarity with best practices in change management, continuous integration, and deployment.

Bachelor’s or Master’s degree in Computer Science or related field.
