Match Point Solutions
Principal Network Engineer-DC & AI clusters
Match Point Solutions, Chandler, Arizona, United States, 85249
MatchPoint Solutions is a fast-growing, young, energetic global IT-Engineering services company with clients across the US. We provide technology solutions to various clients like Uber, Robinhood, Netflix, Airbnb, Google, Sephora, and more! More recently, we have expanded to working internationally in Canada, China, Ireland, UK, Brazil, and India. Through our culture of innovation, we inspire, build, and deliver business results, from idea to outcome. We keep our clients on the cutting edge of the latest technologies and provide solutions by using industry-specific best practices and expertise.
We are excited to be continuously expanding our team. If you are interested in this position, please send over your updated resume. We look forward to hearing from you!
This is a high-impact role that demands someone who can excel in a fast-paced, ever-changing environment.
Principal Network Engineer
US (remote)
140-150/hr (DOE)
What You'll Do
We are seeking a highly skilled Principal Network Engineer to join our team and build the next generation of IT AI clusters. You will help lead the team through a major technology transformation to running AI on-prem, building infrastructure by integrating enterprise-ready platforms on a solid foundation of automation. We are looking for a passionate engineer who will solve networking problems for scalable AI clusters.
This is a hands-on network engineering position focused on the architecture, design, development, and deployment of ultra-high-speed, resilient, and scalable DC AI clusters and interconnects for GPU-accelerated data centers and compute clusters. Outstanding problem-solving abilities, a comprehensive understanding of network security protocols and standards, routing, switching, and automation, and a deep understanding of fundamental network theory are also critical to your success.
What You Will Be Doing
Lead the architecture, design, and deployment of global-scale DC interconnects and fabrics for HPC, AI, and GPU computing clusters.
Develop high-performance data center fabrics using InfiniBand, Ultra Ethernet, and related technologies.
Optimize carrier interconnects, intra- and inter-DC routing, and dark fiber deployments to ensure low latency and high reliability.
Partner with system, OS, GPU, and HPC teams to deliver scalable, highly available networks for extreme-performance workloads.
Implement network monitoring, telemetry, problem solving, and continuous performance improvement processes.
Drive technology selection, vendor engagement, and lifecycle management for data center hardware and software.
Collaborate with internal product managers
What We Need To See
MS or PhD in Electrical Engineering, Computer Science, Computer Engineering, Artificial Intelligence, Data Science, Mathematics, Statistics, or equivalent experience.
12+ years of experience building, managing, and supporting large-scale hybrid networks and developing automation pipelines with Python, Ruby, Go, or other languages used in infrastructure automation.
Expert in networking technologies: InfiniBand, Ultra Ethernet, RoCEv2, DCQCN, TCP/UDP, IPv4/IPv6, BGP/MP-BGP, VPN, L2 switching, EVPN, VXLAN, Segment Routing, MPLS.
Experience automating network infrastructure.
Experience using an automated configuration management system (Python, Terraform, Chef, Puppet, Ansible, Salt, etc.).
MatchPoint Solutions provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.