SK hynix America

AI/ML Computing Cluster Engineer

SK hynix America, San Jose, California, United States, 95199

About SK hynix America

At SK hynix America, we’re at the forefront of semiconductor innovation, developing advanced memory solutions that power everything from smartphones to data centers. As a global leader in DRAM and NAND flash technologies, we drive the evolution of mobile technology, empower cloud computing, and pioneer future technologies. Our cutting‑edge memory technologies are essential in today’s most advanced electronic devices and IT infrastructure, enabling enhanced performance and user experiences across the digital landscape.

We’re looking for innovative minds to join our mission of shaping the future of technology. At SK hynix America, you’ll be part of a team pioneering breakthrough memory solutions while maintaining a strong commitment to sustainability. We’re not just adapting to technological change – we’re driving it, with significant investments in artificial intelligence, machine learning, and eco‑friendly solutions and operational practices. As we continue to expand our market presence and push the boundaries of what’s possible in semiconductor technology, we invite you to be part of our journey to create the next generation of memory solutions that will define the future of computing.

Job Overview

As the AI/ML Computing Cluster Engineer, you will develop and operate high‑performance computing clusters that support AI/ML workloads. You will be responsible for designing, implementing, operating, and optimizing AI data center IT environments to ensure scalability, performance, reliability, and cost‑effectiveness. This role requires collaboration with cross‑functional teams to align computing infrastructure with the organization’s strategic direction.

Responsibilities

Design and implement distributed computing cluster infrastructure to support large‑scale AI/ML model training and inference jobs with a focus on transformer‑based AI models.

Build and maintain distributed systems to ensure scalability, efficient resource allocation, and high throughput.

Optimize cluster performance through hardware selection, equipment configuration, network engineering, and performance analysis.

Deploy and operate data center networking infrastructure using software systems for automation, design validation, deployment, and operational support.

Implement tools and processes to maintain high uptime and ensure infrastructure reliability during both model training and inference phases.

Identify and resolve performance bottlenecks, improving overall system throughput and response times.

Collaborate with cross‑functional teams, including research, security, and benchmark test engineering teams, to integrate infrastructure with AI workflows and ensure seamless deployment and operation.

Engage with technology vendors and partners to evaluate new solutions that drive innovation in AI computing infrastructure.

Qualifications

Master’s degree or above in Computer Science, Electrical Engineering, or related fields.

2+ years of experience in AI cluster engineering, MLOps, and benchmark testing, including tools for GPU performance analysis, memory usage, and energy/power monitoring.

Strong familiarity with AI computing architecture, AI/ML infrastructure requirements, memory architecture and usage in AI/ML, and AI algorithm trends and best practices.

Expertise in optimizing resource utilization, improving system throughput, and reducing latency in both training and inference.

Compensation

Our compensation reflects the cost of labor across several U.S. geographic markets and is based on defined markets. Pay within the provided range varies by work location and may also depend on job‑related skills and experience. Your recruiter can share more about the specific salary range for the job location during the hiring process.

Pay Range: $100,000 – $150,000 USD

Equal Employment Opportunity

SKHYA is an Equal Employment Opportunity Employer. We provide equal employment opportunities to all qualified applicants and employees and prohibit discrimination and harassment of any type without regard to race, sex, pregnancy, sexual orientation, religion, age, gender identity, national origin, color, protected veteran or disability status, genetic information, or any other status protected under applicable federal, state, or local laws.
