The Phoenix Group
This range is provided by The Phoenix Group. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.
Base pay range $125,000.00/yr - $150,000.00/yr
We’re looking for a skilled Site Reliability Engineer to take ownership of platform reliability and operational performance. In this role, you’ll lead incident response efforts, enhance observability, and ensure uptime across our AWS cloud environment. You’ll combine strong technical acumen with a proactive mindset—focusing on automation, scalability, and operational excellence through Infrastructure as Code (IaC). You will be the primary on‑call SRE.
Key Responsibilities
Lead reliability initiatives, SLO tracking, and incident response across critical systems.
Define and enhance monitoring and observability strategies.
Manage and optimize AWS infrastructure, including ECS, Fargate, Lambda, and IAM.
Build and maintain Infrastructure as Code using tools such as Pulumi or Terraform.
Automate operational workflows using Python or Bash.
Support containerized deployments (Docker) and contribute to Kubernetes-based initiatives.
Maintain and troubleshoot Linux-based systems.
Ensure cloud environments adhere to best practices in security, performance, and cost management.
Keep documentation current for operational processes, playbooks, and infrastructure design.
Qualifications
Bachelor’s degree in Computer Science, Information Technology, or equivalent hands‑on experience.
3+ years in Site Reliability, DevOps, or Systems Engineering roles.
Experience managing production on‑call rotations and incident escalations.
Deep knowledge of Linux systems, networking, and cloud infrastructure.
Hands‑on experience with AWS core services (ECS, Fargate, Lambda, IAM).
Proficiency with Infrastructure as Code frameworks (Pulumi, Terraform, or AWS CDK).
Familiarity with observability tools like Datadog, Prometheus, New Relic, or Sentry.
Proficiency in scripting with Python or Bash.
Solid understanding of containerization and CI/CD deployment workflows.
Strong communication and documentation abilities.
Preferred Skills
Experience supporting highly available, real‑time, or AI/ML production environments.
Familiarity with Kubernetes or other orchestration technologies.
AWS or DevOps-related certifications are a plus.
The Phoenix Group Advisors is an equal opportunity employer. We are committed to creating a diverse and inclusive workplace and prohibit discrimination and harassment of any kind based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. We strive to attract talented individuals from all backgrounds and provide equal employment opportunities to all employees and applicants for employment.
Seniority level: Associate
Employment type: Full‑time
Job function: Staffing and Recruiting
Benefits
Medical insurance
Vision insurance
401(k)