Positron

IT Systems Administrator

Positron, Spokane, Washington, United States, 99254

Positron.ai specializes in developing custom hardware systems to accelerate AI inference. These inference systems offer significant performance and efficiency gains over traditional GPU-based systems, delivering advantages in both performance per dollar and performance per watt. Positron exists to create the world's best AI inference systems.

About the role

We’re hiring an IT Systems Administrator to own the on‑prem environment that powers our AI inference systems; keep an on‑prem compute cluster reliable, secure, and observable; support remote access (via VPN) for distributed teammates; and be the hands‑on owner of server room operations, storage, networking, virtualization, provisioning, and monitoring. This is a high‑impact IC role with broad scope across hardware, software, and documentation.

What you'll do

Server room operations:

Rack/unrack servers and network gear; manage cabling; configure PDUs; maintain accurate inventories and diagrams

Storage & backups:

Operate and harden NAS; manage NFS exports/mounts; implement/test backup/restore; enforce access controls

Networking:

Configure/maintain switches, routers, APs, and firewalls; manage VLANs, VPNs (incl. IPsec), DNS/DHCP/IPAM; monitor performance and security; troubleshoot connectivity; manage primary/backup ISPs; support Tailscale access

Provisioning & config management:

Maintain PXE/kickstart/UEFI workflows; automate OS/app configuration with Ansible; keep golden images and templates current

Cluster & job infrastructure:

Monitor cluster utilization and job health; troubleshoot failures/performance issues; plan/execute software and hardware upgrades

Virtualization:

Administer Proxmox (or similar); create/manage VMs and templates; monitor host/guest performance; triage virtualization issues

Observability & incident response:

Operate Prometheus/Grafana (and related exporters/alerts); create actionable alerts; analyze trends; run incident comms and postmortems; schedule and report maintenance windows

Documentation & process:

Maintain runbooks, SOPs, topology maps, and asset records (make/model/SN/tags/location/usage); champion repeatable, auditable operations

Qualifications

5+ years administering Linux systems in a mixed on‑prem environment (servers, switches/firewalls, NAS, SAN). Strong in Ethernet/IP, VLANs, firewalls/VPNs, DNS/DHCP/NTP; confident with Ansible, PXE, Bash, and Git

Hands‑on with NFS/NAS, snapshots/replication, and backup/restore drills

Experience with virtualization (Proxmox/KVM/ESXi), VM templating, and host lifecycle management

Monitoring/alerting with Prometheus/Grafana (or equivalent), plus log collection and dashboarding

Clear documentation habits; steady incident responder with on‑call experience

Nice to have

Tailscale administration; IPsec tunnels; Proxmox clustering and Ceph; L2/L3 switch config (e.g., VLAN trunks, LACP); Terraform; secrets management; hardware automation (Redfish/IPMI)

Familiarity with SLURM or job schedulers; GPU server care and feeding; basic Python for ops tooling

Work mode & physical requirements

Must work on‑site at the Spokane, WA facility for racking, wiring, and inventories

Ability to lift/move ~50 lb servers; follow ESD and safety best practices

Why this role matters

Your work keeps our engineers productive and our systems dependable, shortening time-to-result for ASIC/FPGA/DV and software teams while raising our security, reliability, and velocity.

Equal Opportunity Employer. If you’re excited about the role but don’t meet every bullet, we’d
