Pantera Capital

SRE for AI HPC Systems - Reliability & Scale

Pantera Capital, Palo Alto, California, United States, 94306

A forward-thinking technology firm in Palo Alto seeks a Site Reliability Engineer to ensure the reliability and performance of the HPC infrastructure that powers its AI research. The role involves collaborating with cross-functional teams, designing scalable systems, and troubleshooting complex issues. Ideal candidates have 3+ years of experience in SRE, DevOps, or systems engineering, and a strong background in Linux, scripting, and cloud technologies. The position offers a salary range of $180,000 to $440,000 USD along with comprehensive benefits.