eTeam
Job Description:
Key Responsibilities:
Server Installation & Configuration:
Install, configure, and deploy servers in data center environments, ensuring they are correctly set up for optimal performance and scalability.
Hardware Maintenance:
Perform regular maintenance and health checks on servers, including monitoring hardware performance, updating firmware, and replacing or upgrading components.
Troubleshooting & Repairs:
Diagnose and resolve hardware and software issues related to the servers, ensuring minimal downtime and maintaining system integrity.
Performance Optimization:
Monitor server performance and implement corrective actions to optimize hardware efficiency, stability, and reliability.
System Updates & Patches:
Apply firmware updates, patches, and driver updates to NVIDIA servers, ensuring compatibility with the latest software and hardware environments.
Integration Support:
Help integrate NVIDIA GB200 servers with other systems and software, ensuring compatibility and smooth communication across the network.
Documentation & Reporting:
Maintain accurate records of server configurations, maintenance schedules, and troubleshooting efforts. Generate regular reports on server health, performance, and issues.
Collaboration:
Work closely with IT infrastructure teams, network engineers, and other technical staff to ensure seamless server operations and integration with existing infrastructure.
Data Center Operations:
Support data center operations, ensuring that NVIDIA servers are properly rack-mounted, cabled, and positioned for optimal airflow and cooling.
Required Skills and Qualifications:
Bachelor's degree or high school diploma.
Proven experience working with servers or similar high-performance computing hardware.
Strong understanding of server hardware, including CPU, memory, storage, networking components, and cooling systems.
Solid understanding of networking concepts, protocols, and configurations (TCP/IP, DNS, DHCP, etc.).
Proficiency with server diagnostics tools and hardware monitoring software.
Preferred Qualifications:
Experience with NVIDIA-specific hardware and software solutions, including GPUs, CUDA, and other NVIDIA technologies.
Familiarity with GPU server configurations and use cases, particularly in AI, machine learning, and high-performance computing environments.
Knowledge of server management interfaces such as IPMI, iLO, or similar.
IT certifications (e.g., CompTIA A+, Cisco CCNA, or similar) are a plus.
Familiarity with cloud platforms (AWS, Google Cloud, Azure) and their interaction with on-premises server infrastructure.
Additional Information:
Ability to lift heavy hardware components and perform physical installations and repairs in a data center environment.
Ability to lift up to 30 pounds regularly.
Ability to bend, stoop, crawl, kneel, crouch, reach, stand for long periods, and move about production and warehouse facilities.
The environment is temperature controlled, but otherwise, it is a typical production environment with loud noises.