Johns Hopkins University
HPC Scientific Software Director (IT@JH Research Computing)
Johns Hopkins University, Baltimore, Maryland, United States
Scope of Authority
One or more major and highly complex/technical IT functions (e.g., networking, telecommunications, applications, Web services, etc.) that significantly impact operations and support the entire university, health system, or both.
All IT functions of a large school/division of the university and health system that require a diverse and highly complex IT environment involving several highly technical functions, e.g., complex application development, networking, systems, etc.
Responsibilities
Typically has responsibility for a multi‑million‑dollar budget, including both capital and operating expenses. Technology and business decisions made within this organization are highly complex and must take into account the varied base of products and services supported across the organization to ensure appropriate integration. Typically has a large staff of 20 or more direct and indirect reports, including managers and staff.
Develop technology solutions that anticipate the organization’s needs and are cost‑effective, reliable, and compatible with existing and emerging technologies. Provide input to, and take responsibility for, ongoing operations, budgets, a multi‑year budget forecast, and both strategic and operational planning. Work with constituents and other IT leaders to interpret customer business needs and recommend strategic investments in technology, applications, business processes, personnel, etc. that meet the agreed‑upon goals of the organization. Provide guidance for the development of technology‑related policies and procedures, and represent IT on business‑driven policy committees within and outside of Hopkins. Ensure that applicable Hopkins policies, practices, and regulatory requirements are addressed and followed within their area of responsibility.
Manage customer relationships and satisfaction, as well as adherence to contractual obligations. Facilitate and influence organizational strategic initiatives to achieve mission and organizational goals. Hold direct responsibility for the design, development, and application of technical solutions that satisfy customer needs and are essential to the organization’s ongoing operations. Ensure continuous delivery of information technology support and services through direct management of service level agreements. Develop and implement an effective and efficient organizational structure that, within the bounds of its responsibilities, supports the ongoing operations of the organization. Perform other related duties as requested.
• Lead the software engineering team in building scalable, reproducible, and automated HPC and AI software environments.
• Architect the software stack across multiple clusters, including compilers, libraries, scientific applications, AI/ML frameworks, containers, modules, and workflow orchestration systems.
• Oversee the development and maintenance of automation systems for software deployment, configuration management, CI/CD, and environment lifecycle processes.
• Partner with researchers and domain experts to optimize applications for CPU/GPU architectures, parallel execution, and distributed training or simulation workloads.
• Ensure high reliability of research workflows through robust monitoring, logging, and performance analysis systems.
• Guide the integration of emerging technologies (new GPU platforms, distributed compute frameworks, data processing engines) into production environments.
• Establish coding standards, documentation practices, and reproducibility guidelines for software delivered by the team.
• Lead strategic planning for the software ecosystem, defining technical roadmaps aligned with institutional research priorities.
• Collaborate with systems engineering teams to ensure software and hardware designs evolve cohesively.
• Manage team capacity, mentorship, project planning, vendor engagements, and cross‑functional initiatives.
• Serve as the senior technical authority for software‑related incidents, upgrades, and performance challenges.
• Foster a culture of innovation, experimentation, and high‑quality engineering within the Research Computing software organization.
This role provides direct supervision and strategic oversight for the Research Computing software engineering team, including Sr. HPC Software Engineers, Sr. Scientific Software Engineers, HPC Software Engineers, and Application and User Support Specialists.
Qualifications
• Bachelor’s Degree.
• Ten years of progressively responsible IT management experience including five years of management/supervisory experience. Additional education may substitute for required experience and additional related experience may substitute for required education beyond a high school diploma/graduation equivalent, to the extent permitted by the JHU equivalency formula.
• Ten plus years of experience in HPC, large‑scale software engineering, or research computing, including hands‑on development of distributed or parallelized scientific applications, workflow automation platforms, or AI/ML tooling.
• Five plus years of technical leadership experience, including leading software engineering teams, setting technical direction, and managing complex, multi‑phase R&D or infrastructure projects.
• Deep proficiency in Python, C/C++, Go, Rust, or equivalent languages, with experience optimizing code for parallel, multi‑node, or GPU‑accelerated execution.
• Expertise with HPC and AI software stacks including MPI, CUDA, OpenMP, ROCm, AI/ML frameworks, and distributed computing libraries (Dask, Ray, Horovod).
• Strong experience designing, deploying, and maintaining reproducible research environments using Spack, Lmod, Apptainer/Singularity, and containerized workflows.
• Demonstrated ability to architect CI/CD pipelines, software lifecycle processes, and automation frameworks for large‑scale research software deployments.
• Familiarity with workflow engines (Nextflow, Snakemake), data pipelines, and software systems supporting large‑volume analytics and scientific simulation at scale.
• Proven success building and maintaining collaborative relationships with faculty and research groups, translating scientific requirements into actionable engineering plans.
• Strong communication and documentation skills, with the ability to lead technical initiatives while mentoring staff and fostering a culture of quality, reproducibility, and innovation.
• Experience contributing to strategic planning, budgeting, procurement, and lifecycle management for research software infrastructure.