NxT Level

Sr. SRE, Compute Infrastructure

NxT Level, Boston, Massachusetts, United States, 02108

Job Description

Senior Site Reliability Engineer – Compute Infrastructure

Location: Boston, MA (Hybrid – Tues–Fri Onsite | Mondays Remote)

Compensation: $134,250 – $214,800 + Bonus + Equity + Full Benefits

We are representing a cutting-edge technology company that is seeking a Senior Site Reliability Engineer (SRE) to join their global infrastructure team. In this role, you'll play a critical part in scaling and optimizing the organization's cloud-native Kubernetes platform, the backbone for internal engineering teams delivering high-impact applications and services. This role is ideal for an SRE who thrives in complex distributed environments, is passionate about developer enablement, and enjoys building robust systems that balance performance, reliability, and scalability.

Why You Should Apply:

You'll work on global, mission-critical systems running on modern cloud infrastructure

High autonomy in a fast-paced, high-impact engineering environment

Opportunity to shape SRE best practices across the org

Hybrid work culture that values face-to-face collaboration and innovation

What You'll Do:

Architect and scale cloud-native Kubernetes infrastructure to support internal engineering workflows

Develop tools and platforms that empower product and infrastructure teams to deploy and manage services rapidly and securely

Write clean, efficient, and maintainable code in languages such as Python, Go, C#, or Java

Use Infrastructure as Code (IaC) tools like Terraform or Pulumi to provision and manage cloud resources

Enhance observability and alerting systems using APM, metrics, and log aggregation tools

Partner with developers to optimize CI/CD pipelines and ensure smooth software delivery lifecycles

Provide strong documentation to promote self-service and onboarding across engineering

Continually assess and improve platform reliability, operability, and cost-efficiency

Contribute to system design reviews and mentor junior engineers on cloud-native best practices

What You Bring:

7+ years of experience in Platform Engineering or Site Reliability Engineering

Proven experience managing Kubernetes platforms at scale (e.g., AKS, EKS, or GKE)

Strong programming experience in Python, Go, C#, Java, or similar languages

Deep understanding of cloud platforms like AWS or Azure

Experience with ArgoCD, GitHub Actions, or similar CI/CD tools

Proficiency with observability tooling (Datadog, Prometheus, Grafana, etc.)

Expertise in networking, security protocols, and container orchestration

Familiarity with communication protocols such as SPI, UART, RS485, and modern interfaces like TLS, X.509, etc.

Experience building testable, scalable IaC modules and managing multi-environment deployments

Strong collaboration and documentation habits in cross-functional teams

Empathy for internal users and a customer-focused mindset

Benefits:

Competitive base salary: $134,250 – $214,800 (based on experience & location)

Bonus + equity opportunities

Discretionary time off (DTO) policy

Paid parental leave for all caregivers

Medical, dental, and vision coverage

Fitness and wellness reimbursements

Mental health & professional development support

Hybrid workplace with in-office perks (snacks, events, and team-building activities)

Note: Compensation and benefits may vary depending on experience level and geographic market.