TechWish
Job Description:
Work closely with a wide range of container automation tooling such as Kubernetes and AWS EKS.
Design, implement, and maintain a secure, scalable compute platform as it evolves with the industry.
Champion SRE methodologies around monitoring, alerting, and establishing SLOs and SLAs.
Identify and execute on opportunities to optimize existing systems, improve infrastructure, and eliminate work through automation.
Work alongside other teams to provide postmortem analysis of why services broke or became degraded.
Design and build automation suites to streamline operational support.
Participate in our on-call rotation for the production services we build.
Good understanding of CNCF tools like ArgoCD, Crossplane, and Kyverno.
Established understanding of observability fundamentals (logging, metrics, tracing).
Ability to learn quickly, master our existing systems, and identify areas for improvement.
Strong technical background and the ability to think creatively to solve problems.
Familiarity with Kubernetes Operators, Controllers, and CRDs.
Deep understanding and application of computer science fundamentals: data structures, algorithms, and design patterns.
Exposure to and understanding of cloud (AWS, Google Cloud, Azure, etc.) architectures and services.
Excellent understanding of multi-cluster management and operating at scale.