Saransh Inc
Site Reliability Engineer (SRE) - Full Time
Saransh Inc, Atlanta, Georgia, United States, 30383
Role: Site Reliability Engineer (SRE)
Location: Atlanta, GA / Bellevue, WA / Frisco, TX / Overland Park, KS (Onsite from Day 1)
Job Type: Full Time
Responsibilities
Design, build, and maintain reliable, scalable, and secure cloud-based infrastructure (AWS, Azure, or GCP).
Develop and improve observability using monitoring, logging, and tracing tools (e.g., Prometheus, Grafana, ELK, Datadog).
Automate repetitive tasks and infrastructure using Infrastructure-as-Code (Terraform, CloudFormation, Pulumi).
Create and maintain CI/CD pipelines (e.g., GitHub Actions, GitLab CI, Jenkins, ArgoCD) to support fast and safe delivery.
Lead incident response, root cause analysis, and postmortems to ensure high uptime and rapid recovery.
Optimize system performance, reliability, and cost-effectiveness through proactive monitoring and tuning.
Collaborate with software engineering teams to define SLAs/SLOs and improve service reliability.
Implement and maintain security best practices across environments (e.g., secrets management, IAM, firewalls).
Maintain disaster recovery plans, backups, and high-availability strategies.
Required
5+ years of experience as an SRE, DevOps Engineer, or similar role.
Proficiency in scripting and automation (Bash, Python, Go, etc.).
Strong experience with containerization and orchestration (Docker, Kubernetes, Helm).
Solid understanding of Linux systems administration and networking fundamentals.
Experience with cloud platforms (AWS, Azure, or GCP).
Experience with IaC tools like Terraform or CloudFormation.
Familiarity with GitOps and modern deployment practices.
Hands-on experience with observability tools (e.g., Prometheus, Grafana, Datadog).
Strong troubleshooting and incident response skills.
Preferred
Experience in a high-traffic, microservices-based architecture.
Exposure to service meshes (Istio, Linkerd).
Certifications (e.g., AWS Certified DevOps Engineer, CKA).
Experience with security automation and compliance (e.g., SOC2, ISO27001).
Note: Visa-independent candidates are preferred.
Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: IT Services and IT Consulting