Stefanini Group

Senior Cloud Reliability Engineer

Stefanini Group, Richmond, Virginia, United States, 23214

Stefanini is looking for a Senior Cloud Reliability Engineer across various locations in the USA (Hybrid). This position is for W2 candidates only.

Position Summary

As the Senior Cloud Reliability Engineer in the SRE Service, you will be accountable for implementing reliability practices, with software as the means, for the cloud foundational product line. The SRE Service is part of the Cloud Solutions & Services department and has overall responsibility for the reliability of the numerous cloud foundational environments.

What will be expected of you:

Work as part of cloud foundational platform squads focused on Cloud Networks to demonstrate and champion site reliability culture and practices, and exert technical influence throughout your team.
Develop and maintain automation, scripts, and code that reduce manual work and improve the reliability and stability of the cloud platform.
Develop, integrate, and maintain synthetics (canaries) code to establish the health of the services.
Lead SLI, SLO, and error budget efforts in collaboration with the product team to instrument, visualize, and proactively manage the stability of cloud platforms.
Implement observability (logs, metrics, traces) and monitoring for Cloud Network components such as VPC, VPN Tunnels, GWLB, and Transit Gateway using tools such as SevOne, Grafana, Dynatrace, AWS CloudWatch, and AWS Canary.
Respond to and resolve incidents in a timely manner.
Use Infrastructure as Code (IaC) tools such as Terraform to manage AWS resources.
Develop reusable artifacts and software utilities to industrialize SRE practices.
Perform other duties as assigned.

Qualifications

5-7 years of extensive experience in the end-to-end enterprise software development life cycle, including maintenance and support.
3+ years of experience in observability and SRE practices.
3+ years of experience in cloud networking (routers, firewalls, load balancers, etc.).
Bachelor’s degree in computer science, information systems, or equivalent background.
Extensive knowledge of and experience working in AWS environments. Knowledge of Azure is a plus.
Strong software development experience in the cloud with Python or Go.
Experience with observability, OpenTelemetry, and tools such as Dynatrace, Prometheus, Grafana, AWS CloudWatch, AWS Canary, and AWS EventBridge.
Expertise in automating toil.
Experience in Agile and Scaled Agile environments.
Experience supporting infrastructure for large multi-service applications.
Knowledge of secure coding standards and of banking environments is a plus.
AWS certifications are desirable (AWS Certified Solutions Architect and AWS Certified SysOps Administrator).

Listed salary ranges may vary based on experience, qualifications, and local market. Some positions may include bonuses or other incentives.
