Icahn School of Medicine at Mount Sinai

Senior High Performance Computing System Administrator

Icahn School of Medicine at Mount Sinai, New York, New York, US, 10261

Senior HPC Administrator

The Scientific Computing and Data group at the Icahn School of Medicine at Mount Sinai supports a cutting-edge high-performance computing and data ecosystem for researchers, including HPC systems, clinical research databases, and software development infrastructure. The Senior HPC Administrator is responsible for a computational and data science ecosystem that enables Mount Sinai's scientific and clinical goals, combining technical expertise with strong customer service for researchers. This role reports to the Director for Computational & Data Ecosystem in Scientific Computing and leads projects to completion with little to no supervision.

Base pay range

$120,000.00/yr - $180,060.00/yr

Responsibilities

Design, deploy and maintain the scientific computing ecosystem, including ~30,000 cores with high-bandwidth, low-latency interconnects, GPUs, large shared-memory nodes, databases, scientific workflows and 30+ petabytes of storage across the production, clinical data warehouse and software development environments.
Lead troubleshooting, isolation and resolution of technical issues (application, system, hardware, software, and network); actively monitor systems.
Maintain, tune and manage computational, data, cloud and workflow systems; define and deploy a comprehensive computational and data vision.
Communicate system advantages, disadvantages and tradeoffs.
Design, develop and implement system administration tasks including hardware/software configuration, configuration management, system monitoring (regression tests), usage reporting, performance, security, networking and metrics.
Collaborate with research and hospital IT, compliance, HIPAA, security and other departments to ensure regulatory and policy compliance.
Integrate HPC resources with laboratory equipment and data resources; link data and compute resources.
Research, deploy and optimize resource management and scheduling software; design, tune, manage and upgrade parallel file systems, storage and data-oriented resources.
Develop and manage security infrastructure, including policies and procedures.
Maintain HPC operations following best practices; develop and implement backup policies.
Prepare and manage budgets for hardware, software and maintenance; participate in chargeback/fee recovery analyses and provide operational sustainability recommendations.
Contribute to system design for research proposals; create and maintain clear documentation.
Work effectively with team members and across Mount Sinai; provide after-hours support for critical system and production issues; respond to user tickets.

Qualifications

Bachelor's degree in computer science, engineering or a related scientific field; Master's or PhD preferred
8+ years (more preferred) of progressive HPC system administration and operations experience (Red Hat/CentOS Linux, batch HPC clusters)
Expert troubleshooter; strong teamwork and customer-focused mindset
Experience with job schedulers such as LSF or Slurm and with parallel file systems and storage
Experience with networking and security
Experience with configuration management systems (xCAT, Puppet and/or Ansible)
Experience with databases and web services; InfiniBand and Ethernet networking
Experience in an academic or research environment
Scripting and programming experience; experience with cloud computing
Ability to multitask in a dynamic environment; strong communication, analytical and leadership skills

Preferred Experience

Advanced degree
GPFS, LSF, TSM, IB and Ethernet networking experience
Databases and web services experience

Equal Opportunity Employer

The Mount Sinai Health System is an equal opportunity employer, complying with all applicable federal civil rights laws. We do not discriminate, exclude, or treat individuals differently based on race, color, national origin, age, religion, disability, sex, sexual orientation, gender, veteran status, or any other characteristic protected by law. Mt. Sinai Health System is committed to fostering an environment where all faculty, staff, students, trainees, patients, and communities feel respected and supported, with opportunities for growth and development for all.