Voltage Park
Infrastructure Engineer (Observability)
Voltage Park, San Francisco, California, United States, 94199
Overview
Voltage Park is seeking an Infrastructure Engineer with a focus on Observability to join our Infrastructure Engineering team. Our engineers design and operate the systems that manage thousands of bare-metal servers, GPUs, and high-performance networks across multiple data centers. This role combines the breadth of a core infrastructure engineer with a specialty in observability and telemetry. You'll design and operate metrics, logs, traces, and alerting pipelines that provide actionable insights for both internal teams and external customers, helping to ensure reliability and transparency at scale. This is a fully remote position, although candidates must be based in the continental United States. Unfortunately, we are unable to provide sponsorship for this role.
Base pay range
$140,000.00/yr - $180,000.00/yr
Responsibilities
Design, build, and maintain observability platforms spanning metrics, logs, traces, and events.
Create dashboards and alerting for internal stakeholders (InfraOps, Engineering, Customer Success) and scoped visibility for external customers.
Ingest and correlate telemetry from GPUs, CPUs, networking (Ethernet & InfiniBand), containers, APIs, and BMC/Redfish.
Implement noise-resistant alerting pipelines that improve detection and reduce operational load.
Collaborate with infrastructure, platform, and customer-facing teams to embed observability into workflows.
Contribute to broader infrastructure engineering projects beyond observability.
Qualifications
8+ years in infrastructure engineering, SRE, or observability roles.
Strong experience with monitoring systems (Prometheus, Grafana, ELK, VictoriaMetrics, or similar).
Proficiency in Python, Go, or Bash for automation and data integration.
Familiarity with container/Kubernetes observability.
Understanding of streaming telemetry pipelines (Kafka, OTEL, Promtail, or equivalent).
Strong written and verbal communication skills.
Ideal experiences
Experience with GPU observability, particularly NVIDIA DCGM.
Designing multi-tenant observability solutions with RBAC and scoped queries.
Prior work with correlation engines for RCA, forecasting, or predictive alerting.
Broader exposure to infrastructure domains (networking, storage, provisioning).
Culture
You enjoy working with a small, highly motivated team.
You're comfortable balancing autonomy with company-wide priorities.
You value clarity, documentation, and actionable insights in observability systems.
You're excited to specialize in observability while contributing as a core infrastructure engineer.
Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit. All qualified applicants will receive consideration without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected by law. If you require an accommodation during the job application process, please notify your recruiter.
Compensation Range: $140K - $180K
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Information Technology
Industries
Technology, Information and Internet