Srimatrix Inc.

Sr. Platform DevOps HPC Engineer

Srimatrix Inc., Mountain View, California, US, 94039

Overview

Our client is seeking a Senior Staff Platform Engineer-DevOps/HPC (Sr. Platform DevOps HPC Engineer) to join its team.

When submitting candidates, please provide at minimum the following information: ____

Location: Hybrid, 3 days onsite in Mountain View, CA, with flexibility for the remaining days
Length: 3 6-month CTH
Industry: Aerospace
Employment Type: only Permanent Residents

The goal of a Senior Staff Platform Engineer-DevOps/HPC at our client is to design, build, and maintain secure, high-performance infrastructure that powers aerospace engineering, simulation, testing, and mission operations. You will also support a scalable CI/CD environment that spans both on-premises and cloud environments. Your work will directly enable rapid development, advanced modeling, and the secure operation of systems critical to the client's aerospace programs.

Responsibilities

(Note: The original description provides responsibilities in narrative form; this section is preserved as part of the overview content to reflect role expectations that were stated.)

Requirements

- 12+ years in Platform Engineering, SRE, or DevOps, ideally in mission-critical environments.
- Aerospace/defense experience.
- Familiarity with managing distributed systems, including HPC clusters (Slurm, PBS, Grid Engine).
- Cloud infrastructure experience (AWS, Azure, Google Cloud Platform), preferably Google Cloud Platform or AWS.
- Proficiency with Terraform and Ansible.
- Observability tools experience (Prometheus, Grafana, ELK).
- Strong networking, security, and system performance knowledge.
- CI/CD pipeline and automation experience, preferably GitLab.
- Linux administration and troubleshooting.
- Scripting experience (Python, Bash, Go).

Preferred:

- Cloud-based or hybrid HPC solutions (AWS ParallelCluster).
- Familiarity with NIST 800-53, FedRAMP, and aerospace security.
- Familiarity with storage systems and parallel filesystems (e.g., Lustre, GPFS, PanFS, NFS) in HPC setups.
- Experience deploying and operating Kubernetes in production environments.
- GPU computing exposure (CUDA, AI/ML).
- Relevant certifications (AWS, HashiCorp, CNCF).
