HappyRobot

Site Reliability Engineer — Scale & Resilience for AI Ops

HappyRobot, San Francisco, California, United States

A high-growth AI startup in San Francisco is seeking a Site Reliability Engineer to lead the scaling of its operational resilience. In this role, you will own system stability and debugging workflows, tackle complex failures, and make operations more proactive. Ideal candidates have more than 3 years of experience debugging production systems, strong problem-solving skills, and familiarity with tools such as Datadog and Prometheus. Join a dynamic team dedicated to redefining enterprise operations with cutting-edge AI technology.