Nextonic Solutions

High-Performance Computing Systems Engineer

Nextonic Solutions, Rockville, Maryland, US, 20849

Nextonic Solutions is seeking a High-Performance Computing (HPC) Systems Engineer to join our vibrant team at the National Institutes of Health (NIH), supporting the Scientific Computing and Informatics (SCI) team at the National Center for Advancing Translational Sciences (NCATS) in Rockville, MD. This role focuses on the design, optimization, security, and maintenance of HPC and cloud-based infrastructure that enables cutting-edge biomedical research through scalable, secure, and high-performing computing environments.

Responsibilities:
- Design, configure, and maintain scalable HPC clusters for optimal performance.
- Support documentation and ATO (Authority to Operate) processes.
- Ensure infrastructure design complies with federal security standards and best practices.
- Implement monitoring tools such as XDMoD for transparency and user reporting.
- Integrate platforms such as JupyterHub and job schedulers (e.g., Slurm) for improved interactivity.
- Develop and manage AWS-based infrastructure using Terraform, Packer, and Ansible.
- Automate deployment workflows to streamline provisioning, updates, and scaling.
- Manage systems involved in AWS Secure Cloud Bridging (SCB) and STRIDES initiatives.
- Implement CIS benchmark-aligned system hardening using OpenSCAP.
- Administer optimized compute images (CPU/GPU) for scientific workflows.
- Leverage tools such as OpenHPC, Warewulf, and Ansible for environment management.
- Lead and coordinate quarterly patch cycles.
- Partner with researchers and external stakeholders on critical projects.
- Facilitate solution transitions to other NIH centers and collaborators.
- Contribute to publications and team objectives through deep technical engagement.

Qualifications:
- Experience with federal ATO processes (required)
- HPC architecture and performance optimization (required)
- Scientific software development and deployment
- High-speed network and parallel file system architecture
- Troubleshooting, diagnostics, and technical support
- Strong communication and multitasking skills

Programming & Scripting:
- Languages – Pascal, BASIC, Delphi, Visual Basic, C, C++
- Scripting – Bash, Perl, Python, Ruby, PEAR, Tcl

Systems & Network Administration:
- Linux – RHEL/CentOS, SUSE, Debian, Ubuntu
- Windows – 95–10; NT–Server 2016
- Networking – Active Directory, TCP/IP v4/v6, DHCP, DNS, WINS
- Legacy – NOVELL 3.1–5, VPN, Citrix, Terminal Services

Monitoring & Management Tools:
- Nagios, Ganglia, HP BAC, Precise i3
- SGI SMC, HP PCM, Bright Cluster Manager (incl. Data Analytics)

Infrastructure & Automation:
- Puppet, Cobbler, Ansible, Chef
- Red Hat Satellite, Kickstart, RPM optimization

File Systems & Archiving:
- Panasas (DirectFlow/panfs), DDN (GPFS), SGI DMF, StorHouse/RFS (Filetek)

HPC Tools & Job Scheduling:
- MOAB/MAUI, Torque, PBS Pro, Windows HPC Scheduler

Visualization & Remote Access:
- Nice DCV, EnginFrame, VNC, OpenText Exceed OnDemand, Web Remote Desktop

Containerization & GPU:
- Docker, Kubernetes, Kubeflow, NVIDIA DGX-1 GPU systems

Databases:
- SQL Server (2000–2008), MySQL, Zope

High-Speed Networking:
- InfiniBand, Mellanox, OFED, Voltaire, Force10

Proven experience in:
- HPC architecture and performance tuning
- Cybersecurity in HPC/cloud environments
- Infrastructure as Code (AWS, Terraform, Ansible, Packer)
- Supporting scientific workflows in research environments
