Alibaba Cloud
Cloud Native/Middleware Reliability Engineer (SRE)-Middleware
Alibaba Cloud, Sunnyvale, California, United States, 94087
Job Description
The Alibaba Cloud Cloud-Native Middleware team is responsible for the research and development of distributed software infrastructure. The team delivers API Gateway and microservices solutions to enterprise customers to accelerate their cloud migration and innovation.
Responsibilities
Oversee stability, performance tuning, and high-availability architecture for microservices middleware (ZooKeeper/Nacos), ensuring 24/7 reliability.
Manage the lifecycle of containerized middleware on Kubernetes: deployments, auto-scaling, upgrades, and resource optimization.
Lead troubleshooting of middleware incidents using logs, tracing, and monitoring systems.
Develop diagnostic tools in Java/Go for resolving production issues.
Build automation tools in Python/Go/Shell for deployment, monitoring, and disaster recovery.
Implement chaos engineering, capacity planning, and failover mechanisms.
Collaborate on cloud product strategies and architecture design.
Create technical documentation and standardize middleware operations.
Qualifications
Bachelor's degree or higher in Computer Science, with 3+ years of experience in SRE or middleware operations.
Deep understanding of SRE principles, including how to balance reliability with engineering velocity.
Proven ability to diagnose complex distributed system failures.
Excellent communication skills for cross-team collaboration.
Experience modifying middleware source code for performance.
Kubernetes certifications (CKA/CKAD) or cloud provider certifications.
Additional Information
Salary range: $104,400 - $171,000/year, with potential variations based on experience and location.
Employment type: Full-time, Entry level.