Unify

Staff Site Reliability Engineer, Tech Lead

Unify, New York, New York, US, 10261

About Unify

Unify was founded on January 17th, 2023 by Austin Hughes and Connor Heggie. Prior to Unify, Austin led Ramp’s growth product team focused on new customer acquisition, and Connor was a machine learning research engineer at Scale AI. The rest of our team comes from companies like Airbnb, Spotify, Bridgewater, and LinkedIn.

Our mission is to build the first system-of-action for go-to-market teams, starting with an end-to-end platform powering warm outbound. Today, outbound sales is dominated by cold, mass outreach that floods people's inboxes and converts to deals at a tiny rate. We’re building a platform to power warm outbound, allowing go-to-market teams to get in touch with the right people at the exact time they’re looking for a solution.

We've grown revenue 8x year-over-year and are already serving customers like Guru, Justworks, Together.AI, Flock Safety, Hightouch, and more. We’re a high-energy, high-intensity team, and we’ve raised $58M from Thrive, Emergence, OpenAI, and others. Come join us in changing how go-to-market works.

About the Role

Unify is redefining go-to-market with state-of-the-art AI. As our Staff SRE Tech Lead, you'll own the reliability and scalability of our platform as we add terabytes of data monthly and onboard customers with demanding uptime requirements. You'll set the technical direction for reliability engineering, lead a pod of SREs, and partner directly with engineering leadership to build the systems and practices that keep Unify fast and reliable at scale.

What You'll Do

Lead the SRE pod:

Set technical direction, drive prioritization, and mentor engineers—ensuring the team is tackling the highest-leverage reliability and scalability challenges.

Scale our data infrastructure:

Architect and extend our ClickHouse and PostgreSQL deployments to handle terabytes of new data monthly: designing partitioning strategies, tuning queries, and building resilient replication and failover systems.

Improve system performance:

Profile and optimize critical paths across our backend services, identify bottlenecks in data pipelines and API layers, and ship changes that meaningfully improve latency and throughput.

Build for reliability:

Design and implement rate limiting, circuit breakers, graceful degradation, and other patterns that keep the platform stable under load and during partial failures.

Automate everything:

Drive tooling that eliminates toil—automating deployments, scaling operations, backup verification, and incident remediation.

Instrument and observe:

Build out distributed tracing, metrics, and alerting that give engineers clear visibility into system behavior and make debugging production issues fast.

Define and enforce SLOs:

Establish reliability targets aligned with customer needs, manage error budgets, and drive architectural decisions that balance shipping speed with system stability.

Who You Are

8+ years of software engineering experience with a strong backend foundation, including 3+ years focused on reliability, infrastructure, or platform work.

Experience leading teams or pods—setting technical direction, mentoring engineers, and driving execution on complex projects.

Deep expertise operating databases at scale, including schema design, query optimization, replication, and failover strategies.

Strong programming skills (TypeScript, Python, Go, or similar) with a track record of building automation and tooling that meaningfully reduces operational burden.

Collaborative, low-ego attitude with a history of leveling up the people around you.
