Logo
CBRE Group, Inc.

Data Center Technical Manager

CBRE Group, Inc., Lemont, Illinois, United States, 60439

Save Job

About the Role: As the

Technical Manager for the Argonne National Laboratory (ANL) Aurora Exascale Supercomputer Facility , you will serve as the primary engineering authority supporting one of the most advanced high‑performance computing (HPC) environments in the world. This role provides specialized oversight of

critical electrical, mechanical, cooling, and facility infrastructure

directly supporting Aurora’s exascale compute capability.

You will act as both:

Site Subject Matter Expert (SME)

for Aurora’s high‑density compute, liquid‑cooling, and MEP systems; and

Owner of Critical Environment Risk Management (CERM)

for the ANL Aurora facility, ensuring the stability, safety, and resilience of the supporting critical infrastructure.

This position requires an advanced engineering skill set, deep infrastructure knowledge, and the ability to operate within an national laboratory environment, supporting DOE mission‑critical workloads and rigorous compliance requirements.

Essential Duties & Responsibilities

Ensure all Aurora infrastructure operations comply with DOE, ANL, CBRE, and vendor technical requirements, including contract‑specific engineering deliverables.

Maintain and govern all critical infrastructure drawings for Aurora, including chilled water distribution, liquid‑cooling loops, electrical one‑lines, and HPC load distribution diagrams.

Develop and maintain detailed site‑specific RACI matrices tailored to Aurora’s HPC operational structure.

Perform staffing analysis to support 24/7 coverage, high‑density cooling management, and HPC system maintenance windows.

Lead creation and updates of business continuity and disaster recovery plans specific to the high‑performance computing environment.

Implement training and qualification programs for critical facilities engineers supporting Aurora’s unique cooling and power systems.

Coordinate mock failure scenarios (e.g., chiller plant failures, power transitions, liquid‑cooling loop disruptions) to validate operational readiness.

Oversee maintenance scheduling to align with Aurora compute job scheduling, minimizing scientific workload interruptions.

Develop and refine operating procedures (SOP/EOP/MOP) specific to exascale HPC cooling, power sequencing, and system change protocols.

Provide incident response support, including contribution to root cause analysis for any facility events that impact HPC availability.

Manage Aurora’s power, cooling, and infrastructure capacity in line with exascale system demands and future technology refresh cycles.

Maintain a site‑specific risk register covering HPC thermal loads, water systems, UPS distribution, and mission interruptions.

Oversee airflow and liquid‑cooling optimization to support Aurora’s extreme compute power density.

Assess lifecycle condition of assets supporting Aurora (UPC, CW pumps, cooling towers, CDUs, heat exchangers) and inform ANL capital planning.

Develop and implement sustainability strategies in accordance with DOE sustainability goals, including water usage, energy efficiency, and PUE/WUE reduction.

Knowledge Operation, maintenance, and repair of data center critical infrastructure, including:

Standby generators, UPS systems, PDUs, ATSs supporting the Aurora supercomputer

Large‑scale chilled water plants, liquid‑cooling distribution units, and plate heat exchangers

CRAHs, CRACs, and Aurora‑specific cooling technologies (e.g., warm‑water loops, rear‑door exchangers)

BMS, EPMS, CMMS, DCIM systems integrated with HPC telemetry systems

Engineering Knowledge of:

Psychrometric charts, HVAC load calculations, and hydronic pipe sizing.

Reading electrical one‑lines, chilled, and condenser water diagrams.

Standard sequences of operation for electrical and mechanical data center systems.

Electrical power calculations per NFPA 70 (NEC), coordination, arc‑flash studies (NFPA 70E), and maintenance practices (NFPA 70B).

Industry standards, including ASHRAE Datacom/TC 9.9 and OCP publications.

Principles of preventative, predictive, and reactive maintenance.

Energy efficiency metrics (e.g., PUE, WUE) and sustainable data center practices.

Skills

Ability to analyze performance data from Aurora’s environmental telemetry, SCADA systems, and HPC monitoring tools.

Strong documentation abilities for highly regulated environments (DOE/ANL).

Proficient in Microsoft Office Suite (Word, Excel, Outlook, PowerPoint, Teams) and Microsoft Power BI for data analysis and reporting.

Proficient in Bluebeam, CAD, and BIM software for technical documentation.

Strong communication, capable of bridging discussions between CBRE, ANL, DOE, and vendor engineering teams.

Analytical and strategic thinking for planning and problem‑solving.

Talents

Analytical: Objective in identifying patterns and root causes through systematic analysis.

Adaptable: Thrives in dynamic environments, managing multiple priorities effectively.

Focused: Maintains clear objectives and filters actions to achieve goals.

Responsible: Takes ownership of commitments and delivers results reliably.

Qualifications

Bachelor’s degree in mechanical, electrical, or related engineering field is preferred

PE licensure a plus, especially with experience in HPC environments or mission‑critical facilities.

3–5 years of experience in data center or HPC facility operations; exascale or national laboratory experience highly desirable.

#J-18808-Ljbffr