Aaaipress
Overview
1. Design, implement and advance core HPC systems such as the HPC provisioning system, the resource-management system, account/user lifecycle management, and user authentication and authorization systems.
2. Design, deploy, configure and support HPC clusters, including compute, networking, parallel storage and backup.
3. Install, administer and maintain hardware, system software, networking, accounts, and security measures.
4. Diagnose and correct system issues, whether they concern correct operation or performance.
5. Develop and maintain documentation.
6. Research developments in HPC architecture and new technologies, processes, and methodologies.
7. Determine specifications for new systems, and tailor these to meet research needs.

Responsibilities
Design, implement and advance core HPC systems such as the HPC provisioning system, the resource-management system, account/user lifecycle management, and user authentication and authorization systems.
Design, deploy, configure and support HPC clusters, including compute, networking, parallel storage and backup.
Install, administer and maintain hardware, system software, networking, accounts, and security measures.
Diagnose and correct system issues, whether they concern correct operation or performance.
Develop and maintain documentation.
Research developments in HPC architecture and new technologies, processes, and methodologies.
Determine specifications for new systems, and tailor these to meet research needs.

Required Skills / Abilities
Expertise in administration of HPC Linux clusters, including managing and configuring cluster provisioning and management tools and batch schedulers.
Experience with high-speed networking such as InfiniBand and high-speed Ethernet.
Experience with large storage systems and parallel file systems such as GPFS and Lustre.
Expertise in Linux system administration, including managing the operating system, networking, storage, and security.
Expertise in automation and scripting in at least one scripting language.
Ability to work in a team environment in a fast-moving technology field.
Excellent verbal and writing skills.
Ability to interact well with team members and end users.
Ability to work independently and across units.
Attention to detail.
Ability to take the care necessary to be entrusted with a system that hundreds of users depend on for research computation and the storage of research data.

Preferred Education, Experience and Skills
Experience with GPUs.
Ability to specify new systems, especially for AI and ML.
Experience configuring, deploying, and supporting large-scale systems in a research environment.
Expertise in computer security in large, multi-user Linux environments.
Experience with remote administration and with installing and troubleshooting hardware.
Expertise securing large Linux environments.

Work Week
Standard (M-F, equal number of hours per day)

Posting Information
Posting Position Title: Senior High Performance Computing Administrator
University Job Title: Senior High Performance Computing Administrator

Preferred Education, Experience and Skills (Summary)
Experience with GPUs.
Ability to specify new systems, especially for AI and ML.
Experience configuring, deploying, and supporting large-scale systems in a research environment.
Expertise in computer security in large, multi-user Linux environments.
Experience with remote administration and with installing and troubleshooting hardware.
Expertise securing large Linux environments.
Bachelor’s Degree in a related field and a minimum of six years of related work experience, or an equivalent combination of education and experience.