IBM

Lead High Performance Computing System Architect

IBM, Detroit, Michigan, United States, 48228

GREAT OPPORTUNITY WITH IBM FEDERAL CONSULTING

Your Role And Responsibilities

The Lead System Architect shall possess extensive, relevant experience in the design, installation, integration, testing, and acceptance of newly deployed High-Performance Computing (HPC) systems, as well as in the operations, maintenance, and enhancement (i.e., expansions, refreshes, upgrades, and tenant fit-ups) of existing HPC systems. The System Architect is a technical leader responsible for the end-to-end design, implementation, and optimization of an HPC system to meet specific computational requirements.

This position requires periodic travel and some work on evenings, weekends, and/or holidays. The job may require after-hours response to emergency issues. Periodically scheduled on-call duty may require after-hours response to technical emergencies not explicitly related to assigned job responsibilities.

Preferred Education

None

Required Technical And Professional Expertise

Minimum of five years’ experience in Linux systems administration.

Bachelor's degree in computer science, engineering, math, or a scientific discipline with 2+ years of systems engineering experience; or 5 years’ experience in HPC architecture.

Five years’ hands-on architecture design experience with HPC, including storage, file system, InfiniBand, security, authentication, and compute architecture.

Five years’ experience with HPC hardware architectures, including processors, memory subsystems, network fabrics, and interconnects.

Must be eligible for a federal clearance.

Public trust is required but can be obtained after being hired.

Preferred Technical And Professional Experience

Five years’ experience with HPC software stack components such as compilers, runtime systems, job schedulers, and scientific libraries.

Five years’ experience in storage administration and optimization, such as performing upgrades and defining RAID configurations.

Experience with system administration and cluster management tools (e.g., LSF, Slurm, PBS).

Five years’ experience with distributed file systems (Lustre, Ceph, GPFS).
