Diverse Lynx
Title: HPC Architect
Chicago, IL (Onsite/Remote)
Contractual requirement
Role And Responsibilities: The "HPC Architect" is responsible for designing and implementing HPC practices within an organization. Competencies:
HPC AI System Management
Below is the list of principal responsibilities:
System Design and Implementation: Design, build, and configure HPC clusters, including hardware and software components.
System Administration: Manage and maintain the HPC infrastructure, including operating systems, storage, and networking.
Performance Optimization: Analyze system performance, identify bottlenecks, and implement solutions to optimize performance for various applications.
Troubleshooting and Support: Diagnose and resolve issues with the HPC system, providing support to researchers and users.
Scripting and Automation: Develop scripts and automation tools to streamline routine tasks and improve efficiency.
Collaboration: Work closely with researchers, scientists, and other engineers to understand their needs and provide effective solutions.
Documentation: Maintain clear and accurate documentation of system configurations, procedures, and troubleshooting steps.
Monitoring and Maintenance: Monitor system health, perform maintenance tasks, and plan for upgrades and new technologies.
Security: Ensure the secure and effective operation of HPC systems.
Qualifications: Science graduate (4-year degree)
Preferred Skills:
Linux Systems: Strong understanding of Linux operating systems and environment.
Cluster Management Software: Familiarity with cluster management software like Slurm, PBS, or LSF.
Scripting: Proficiency in scripting languages like Python or Bash.
High-Performance Computing Concepts: Understanding of HPC architectures, parallel computing, and related technologies.
Troubleshooting: Ability to diagnose and resolve complex technical issues.
Communication: Strong verbal and written communication skills.
Collaboration: Ability to work effectively with diverse teams and individuals.
Cloud Technologies: Knowledge of cloud solutions like AWS, GCP, or Azure (increasingly relevant).
Infrastructure as Code (IaC): Experience with IaC tools like Ansible or Terraform (increasingly relevant).
Scientific Computing: Familiarity with scientific and engineering applications and workflows (depending on the specific role).
GPU Usage: Experience with GPU usage in compute clusters and CUDA (if applicable).
Resource would need to be 100% onsite (as guidelines allow), with some work-from-home flexibility offered.
Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualified applicants will receive due consideration for employment without any discrimination. All applicants will be evaluated solely on the basis of their ability, competence and their proven capability to perform the functions outlined in the corresponding role. We promote and support a diverse workforce across all levels in the company.