poolside
Director of Network Infrastructure and Capacity
About Poolside
Poolside exists to be this company – to build a world where AI will be the engine behind economically valuable work and scientific progress.
About The Role
The Director of Network Infrastructure and Capacity will lead the engineering, planning, deployment, and optimization of network systems to support AI‑centric datacenter operations. This role requires a forward‑looking vision, deep technical expertise, and strong leadership to ensure ultra‑low latency, high‑bandwidth, and scalable network performance across large‑scale AI clusters.
Responsibilities
AI Workload‑Ready Design
Architect high‑throughput, low‑latency networking fabrics (e.g., InfiniBand, RoCE, NVLink over Ethernet) to support distributed AI training and inference at scale.
Develop strategies for network scalability to handle massive east‑west traffic, model checkpointing, and inter‑GPU communication.
Capacity Planning & Optimization
Forecast and plan for exponential growth in compute and storage traffic.
Design capacity models that balance throughput, resiliency, and energy efficiency.
Inference Ingress Bandwidth Management
Engineer external connectivity strategies to support large‑scale inference services with high ingress bandwidth demands.
Optimize north‑south traffic flows to handle millions of concurrent API calls, streaming requests, and customer‑facing inference workloads.
Deploy and manage multi‑terabit ingress points, global load balancing, and edge integration to minimize latency for user‑facing applications.
Collaborate with service teams to ensure predictable ingress performance for real‑time AI/ML products, including LLMs, multimodal inference, and recommendation systems.
Deployment & Operations
Oversee structured cabling, spine‑leaf architectures, optical networking, and high‑density interconnects optimized for AI clusters.
Ensure uptime, low jitter, and predictable latency across multi‑terabit fabrics.
Collaborate with datacenter operations on cooling, power, and space requirements tied to network growth.
Advanced Network Engineering
Integrate software‑defined networking (SDN) and network automation for rapid scaling and operational efficiency.
Implement telemetry, real‑time observability, and AI‑driven monitoring tools to optimize traffic flows.
Stay ahead of new standards in AI networking (e.g., Ultra Ethernet Consortium, OpenAI fabric blueprints, NVIDIA/AMD interconnects).
Security & Compliance
Ensure robust segmentation, encryption, and compliance for sensitive AI workloads.
Partner with cybersecurity to build zero‑trust architectures for AI clusters.
Leadership & Vendor Engagement
Lead and grow a high‑performing team of network engineers.
Manage strategic vendor relationships for next‑gen switches, optics, and interconnect technologies.
Represent the organization in industry groups and forums shaping the future of AI datacenter networking.
Skills & Experience
Education:
Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, Network Engineering, or a related field.
Experience:
10+ years in datacenter networking or infrastructure engineering.
3+ years in a leadership role overseeing large‑scale, high‑performance networking environments.
Direct experience with AI/ML cluster networking strongly preferred.
Technical Skills:
Expertise in routing, switching, and high‑performance fabrics (InfiniBand, RoCE, 400/800G Ethernet).
Experience with datacenter networking (Cisco, Juniper, Arista, NVIDIA/Mellanox).
Deep understanding of cloud networking, SDN, and network automation.
Familiarity with optical networking, DWDM, and hyperscale interconnects.
Soft Skills:
Strategic thinker with the ability to translate AI growth into network design.
Strong leadership and cross‑functional collaboration.
Excellent communication and stakeholder engagement.
Preferred Certifications:
CCIE, JNCIE, or equivalent advanced networking certifications.
Process
Intro call with Lance, our VP of Data Center
Interview with Pierre‑Yves, Founding Engineer
Interview with Eiso, our Co‑CEO
Team fit call with the People team
Benefits
Hybrid work & flexible hours
37 days/year of vacation & holidays
Health insurance allowance for you and dependents
Company‑provided equipment