Alibaba Cloud

Cloud Native/Middleware Reliability Engineer (SRE)-Middleware

Alibaba Cloud, Sunnyvale, California, United States, 94087

Job Description

The Alibaba Cloud Cloud-Native Middleware team is responsible for the research and development of distributed software infrastructure. The team delivers API Gateway and microservices solutions to enterprise customers, accelerating their cloud migration and innovation.

Responsibilities

Oversee stability, performance tuning, and high-availability architecture for microservices middleware (ZooKeeper/Nacos), ensuring 24/7 reliability.

Manage containerized middleware lifecycle on Kubernetes: deployments, auto-scaling, upgrades, resource optimization.

Lead troubleshooting of middleware incidents using logs, tracing, and monitoring systems.

Develop diagnostic tools in Java/Go for resolving production issues.

Build automation tools in Python/Go/Shell for deployment, monitoring, and disaster recovery.

Implement chaos engineering, capacity planning, and failover mechanisms.

Collaborate on cloud product strategies and architecture design.

Create technical documentation and standardize middleware operations.

Qualifications

Bachelor's degree or higher in Computer Science, with 3+ years of experience in SRE or middleware operations.

Deep understanding of SRE principles and balancing reliability with engineering velocity.

Proven ability to diagnose complex distributed system failures.

Excellent communication skills for cross-team collaboration.

Experience modifying middleware source code to improve performance.

Kubernetes certifications (CKA/CKAD) or cloud provider certifications.

Additional Information

Salary range: $104,400 - $171,000/year, with potential variations based on experience and location.

Employment type: Full-time, Entry level.
