Alibaba Cloud

Site Reliability Engineer

Alibaba Cloud, Sunnyvale, California, United States, 94087

Site Reliability Engineer – Cloud Intelligence Group

The Cloud Intelligence Group is responsible for Alibaba Group's core technologies and business innovation in the high‑tech sector, and is dedicated to building an enterprise‑level cloud computing service platform for the digital economy era.

The SRE team's mission is to keep the production environment stable and to ensure the reliability of enterprise‑level cloud computing data and the continuity of services, so that customers' cloud‑based businesses run without interruption and availability exceeds 99.99%.

Key Responsibilities

Perform daily operations and maintenance of applications, databases, and middleware, including troubleshooting and answering customer inquiries.

Collaborate with R&D to develop critical support plans for customers' peak business periods, covering preparation ahead of the peak, on‑duty support during the critical period, and a review afterward.

Qualifications

A degree in Computer Science or a related field.

At least 3 years of experience as a Site Reliability Engineer (SRE).

Proficiency with Linux environments or cloud infrastructure.

Exceptional system diagnostic and problem‑solving skills.

Strong teamwork spirit and ability to work well under pressure.

In‑depth understanding of Kubernetes or monitoring systems.

Expertise in programming languages such as Golang, Python, or Java.

Experience in designing and implementing distributed systems.

Pay range: $104,400 – $171,000 per year (may vary based on location, skills, and experience). The position is at‑will; the company reserves the right to modify base salary and other compensation at any time.

