SPECTRAFORCE

Site Reliability Engineer

SPECTRAFORCE, San Diego, California, United States, 92189


Senior Recruiter at SPECTRAFORCE Technologies

Role: Site Reliability Engineer

Duration: 12 Months

Note: W2 only

Job Description:

The Site Reliability Engineer (SRE) will work closely with cross‑functional teams, including software development, platform, and operations, to support the availability and performance of our cloud‑based systems. You will take ownership of the cloud infrastructure, support automation, and implement monitoring and alerting systems to manage issues proactively.

Key Responsibilities

• Design, deploy, and maintain scalable, secure, and highly available cloud infrastructure on AWS and Azure.

• Apply proficiency in infrastructure‑as‑code (Terraform, AWS CDK, and CloudFormation) and scripting languages (TypeScript, PowerShell, or Go).

• Ensure cloud environments adhere to regulatory standards for healthcare data security (e.g., SOC 2 and ePHI compliance).

Observability and Monitoring

Implement, configure, and optimize Datadog for application and infrastructure monitoring, ensuring full‑stack visibility into system performance.

Set up alerting mechanisms for critical metrics (e.g., system health, latency, error rates) and establish runbooks for incident response.

Develop and maintain dashboards to provide real‑time insights into system performance.

Performance Optimization & Troubleshooting

Identify and resolve performance bottlenecks and ensure the reliability and scalability of production systems.

Perform root cause analysis for incidents and participate in on‑call rotations to manage critical system incidents.

Drive improvements to system architecture, security, and disaster recovery strategies.

Work closely with development teams to incorporate CI/CD pipelines and foster a culture of “infrastructure as code” and automation.

Collaborate with security and compliance teams to ensure systems meet all regulatory and security requirements.

Promote best practices for software delivery, system monitoring, and infrastructure scalability.

Security & Compliance

Work with the compliance and cybersecurity teams to maintain healthcare data security, ensuring that systems are SOC 2 and ePHI compliant.

Implement security best practices within cloud environments, including encryption, IAM, and regular audits.

Qualifications

Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent practical experience.

3+ years of experience as a Site Reliability Engineer, managing infrastructure on AWS and/or Azure.

Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, etc.).

Expertise in Terraform, CloudFormation, AWS CDK or similar infrastructure‑as‑code technologies.

Proficiency in container orchestration and management (e.g., Docker, Kubernetes).

Knowledge of automation tools (e.g., Ansible, Puppet, Chef).

Familiarity with CI/CD pipeline tools such as Jenkins, GitHub Actions, or Azure DevOps.

Experience with healthcare data security and compliance (e.g., SOC 2 and ePHI requirements) is a plus.

Excellent problem‑solving and troubleshooting skills.

Strong collaboration and communication skills.

Nice to Have

Experience working in a regulated industry, particularly healthcare or medical devices.

Certifications such as AWS Certified Solutions Architect, Azure Administrator, or Certified Kubernetes Administrator (CKA).

Experience with AI/ML models for predictive maintenance and performance monitoring.

Familiarity with serverless architectures (e.g., AWS Lambda, Azure Functions).

Any Additional Information

Strong analytical and decision‑making abilities

Able to build strong partnerships with business partners and project teams

Takes responsibility for delivering superior value and client service

Works well with people who have diverse abilities, experiences, and perspectives

Influences others without direct authority

Approaches opportunities and issues with an optimistic, action‑oriented, and solution‑based mindset

Good writing skills to document plans and processes

Seniority level: Mid‑Senior level

Employment type: Contract

Job function: Information Technology

Industries: Hospitals and Health Care
