Selby Jennings
This top trading firm is looking to bring on an HPC Network Engineer to join its team in Dallas!
What You'll Be Doing
Architect and oversee a high-throughput compute environment
Expand and optimize infrastructure to support growing technical demands
Manage a bare-metal provisioning stack, with emphasis on OpenStack Ironic
Continuously monitor system health and implement performance improvements
Establish and refine operational procedures to reduce downtime and hardware faults
Conduct diagnostics, performance tuning, and capacity forecasting
Review and enhance hardware lifecycle workflows
Collaborate across teams to align infrastructure with broader technical goals
Apply security best practices to hardware and platform-level systems
Guide and mentor junior team members, fostering a culture of technical growth
What You Bring
Hands-on experience managing complex HPC environments at scale
In-depth understanding of server architecture, including compute, memory, storage, and networking components
Strong background in bare-metal provisioning and infrastructure-as-code practices
Proven ability to troubleshoot and resolve hardware issues in production environments
Familiarity with automation frameworks such as Ansible, Puppet, or Chef
Experience with out-of-band management tools and APIs (e.g., Redfish, iDRAC, iLO, BMC, IPMI)
Skills in system tuning, diagnostics, and capacity planning
Knowledge of thermal and power efficiency in data center environments
Awareness of hardware-level security practices
Strong analytical and communication skills, with a collaborative mindset
Bonus Points For
Experience in hyperscale or large compute cluster environments
Knowledge of high-speed networking technologies (e.g., InfiniBand, 100GbE)
Familiarity with Linux systems and scripting languages (Python, Bash, PowerShell)
Exposure to OpenStack or similar cloud infrastructure platforms
Experience with GPU management tools and debugging (e.g., NVIDIA-SMI)
Prior leadership experience in mentoring or managing technical teams