Join to apply for the Site Reliability Engineer role at PsiQuantum
Quantum computing holds the promise of humanity's mastery over the natural world, but only if we can build a real quantum computer. PsiQuantum is on a mission to build the first real, useful quantum computers, capable of delivering the world-changing applications that the technology has long promised. We know that means we will need to build a system with roughly 1 million qubits that supports fault-tolerant error correction within a scalable architecture and a data center footprint.
By harnessing the laws of quantum physics, quantum computers can provide exponential performance increases over today's most powerful supercomputers, offering the potential for extraordinary advances across a broad range of industries including climate, energy, healthcare, pharmaceuticals, finance, agriculture, transportation, materials design, and many more.
PsiQuantum has determined the fastest path to delivering a useful quantum computer, years earlier than the rest of the industry. Our architecture is based on silicon photonics, which gives us the ability to produce our components at Tier-1 semiconductor fabs such as GlobalFoundries, where we leverage high-volume semiconductor manufacturing processes, the same processes that are already producing billions of chips for telecom and consumer electronics applications. We also benefit from the quantum mechanical reality that photons don't feel heat or electromagnetic interference, allowing us to take advantage of existing cryogenic cooling systems and industry-standard fiber connectivity.
In 2024, PsiQuantum announced two government-funded projects to support the build-out of our first Quantum Data Centers and utility-scale quantum computers in Brisbane, Australia and Chicago, Illinois. Both projects are backed by nations that understand quantum computing's potential impact and the need to scale this technology to unlock that potential. And we won't just be building the hardware, but also the fault-tolerant quantum applications that will provide industry-transforming results.
Quantum computing is not just an evolution of the decades-old advancement in compute power. It provides the key to mastering our future, not merely discovering it. The potential is enormous, and we have the plan to make it real. Come join us.
There's much more work to be done, and we are looking for exceptional talent to join us on this extraordinary journey!
Job Summary
Join the OS/Platform team as a Site Reliability Engineer (SRE) and keep our services healthy, observable, and fast. Partnering with the Platform Engineering group, you'll own the day-to-day operation of our monitoring stack (Grafana, Prometheus, Loki, and Tempo), crafting dashboards that surface golden signals and drive real-time insight. You'll codify reliability through SLIs/SLOs, automate runbooks in Python, and lead incident response to maintain world-class uptime across both on-prem and AWS environments.
Responsibilities
- Define, implement, and iterate on Service Level Indicators & Service Level Objectives (SLIs/SLOs) and error budgets for critical services.
- Build and maintain Grafana dashboards that visualize golden signals (latency, traffic, errors, saturation) for engineers and stakeholders.
- Operate and tune our observability pipeline (Prometheus, Loki, Tempo) to ensure scalable, low-latency telemetry ingestion and alerting.
- Drive incident response: triage, mitigate, perform post-incident reviews, and implement preventive actions.
- Develop automation and self-service tooling in Python/Bash to streamline alerts, runbooks, and operational tasks.
- Collaborate with Platform and Product teams on capacity planning, performance testing, and change management.
- Improve CI/CD health checks and release safety nets within GitLab.
- Contribute to infrastructure as code (Terraform, Ansible) for monitoring stack deployments and upgrades.
Qualifications
- Bachelor's degree or higher in Computer Science, Engineering, or another related technical field.
- 5+ years in an SRE, DevOps, or Production Engineering role supporting distributed systems in production.
- Hands-on expertise with observability tools: Grafana, Prometheus, Loki, Tempo (or equivalent).
- Proven track record designing dashboards and alerts around golden signals and the USE (Utilization, Saturation, Errors) and RED (Rate, Errors, Duration) methodologies.
- Solid scripting/automation skills in Python and Bash; familiarity with GitLab CI pipelines.
- Operational experience with Kubernetes and containerized workloads.
- Working knowledge of AWS services, networking fundamentals, and load balancing.
- Experience running incident response and writing actionable postmortems.
- Familiarity with Infrastructure as Code (Terraform, Ansible) and configuration management.
- Exposure to regulated environments and multi-region architectures is a plus.
- Strong communication and collaboration skills; comfortable acting as a generalist across infrastructure, application, and data layers.
Note: PsiQuantum will only reach out to you using an official PsiQuantum email address and will never ask you for bank account information as part of the interview process. Please report any suspicious activity to .
We are not accepting unsolicited resumes from employment agencies.
The ranges below reflect the target ranges for a new hire base salary. The first is for the Bay Area (within 50 miles of HQ, Palo Alto); the second (if applicable) is for elsewhere in the US (beyond 50 miles of HQ, Palo Alto). If there is only one range, it is for the specific location where the position will be based. Actual compensation may fall outside these ranges and depends on various factors including, but not limited to, a candidate's qualifications (relevant education and training, competencies, and experience), geographic location, and business needs. Base pay is only one part of the total compensation package. Full-time roles are eligible for equity and benefits. Base pay is subject to change and may be modified in the future.
U.S. Base Pay Range
$120,000-$140,000 USD
Bay Area Pay Range
$145,000-$165,000 USD
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Engineering and Information Technology
Industries
Computer Hardware Manufacturing