Lenovo
Overview
The Lenovo AI Technology Center (LATC) – Lenovo's global AI Center of Excellence – is driving the transformation into an AI-first organization. We are assembling a world-class team of researchers, engineers, and innovators to position Lenovo and its customers at the forefront of the generational shift toward AI. Lenovo delivers products across wearables, smartphones, laptops, PCs, workstations, servers, and services/solutions, enabling rapid AI innovation and deployment across on-device, edge, and cloud environments. If you're ready to shape AI at a global scale, join us.
Position Summary
Lenovo is seeking a technical leader to head our AI Model Deployment & Optimization team. In this high-impact role, you will drive the development and large-scale deployment of cutting-edge AI capabilities across Lenovo devices and platforms – from on-device inference to cloud-enabled workloads. You will adapt, fine-tune, and optimize open-source and proprietary foundation models for performance, efficiency, and user impact, ensuring they run across Windows and Android and various hardware architectures from Qualcomm, Nvidia, Intel, AMD, MediaTek, and others. Your team will sit at the intersection of AI software, hardware acceleration, and product innovation, pushing model compression, quantization, pruning, distillation, and hardware-aware AI optimization to reach hundreds of millions of users globally.
Key Responsibilities
Lead Lenovo's AI model deployment and optimization across devices, laptops, and cloud environments.
Adapt, fine-tune, and optimize open-source and proprietary foundation models (e.g., from OpenAI, Google, Microsoft, and Meta) for Lenovo's product portfolio.
Drive initiatives in model compression, quantization, pruning, and distillation to maximize efficiency on constrained devices while preserving model quality.
Collaborate with hardware architecture teams to align AI model efficiency with device and accelerator capabilities.
Develop hardware-aware optimization algorithms and integrate them into model deployment pipelines.
Utilize the latest AI frameworks and libraries to achieve the best inference performance for each model and hardware combination.
Establish and maintain reproducible workflows, automation pipelines, and release-readiness criteria for AI models.
Build, mentor, and inspire a high-performance applied AI engineering team.
Required Qualifications
Experience: 10+ years in production software development, including AI/ML engineering, with 3+ years in leadership roles. Proven track record in model deployment and optimization at scale. Demonstrated ability to deliver production-grade AI models optimized for on-device and/or cloud environments.
Optimization Techniques: Strong expertise in quantization, pruning, distillation, graph optimization, mixed precision, and hardware-specific tuning (NPUs, GPUs, TPUs, custom accelerators).
Familiarity with model inference frameworks such as ONNX Runtime, TensorRT, TVM, OpenVINO, RadeonML, QNN, and NeuroPilot.
Data & Telemetry: Building feedback loops from runtime telemetry to guide retraining, routing, and optimization.
Preferred Qualifications
Graduate degree (MS or PhD) in Computer Science, AI/ML, Computational Engineering, or related field.
Experience delivering production-grade model inferencing solutions for both cloud and edge.
Familiarity with edge device constraints related to Windows or Android, preferably both.
Track record of collaboration with research institutions and contributions to open-source AI optimization libraries.
Security & Compliance: Ensuring secure deployments, model integrity verification, and adherence to privacy regulations.
Excellent leadership, communication, and cross-functional collaboration skills.
#LATC #LATT-LT
Equal Opportunity
We are an Equal Opportunity Employer and do not discriminate against any employee or applicant for employment because of race, color, sex, age, religion, sexual orientation, gender identity, national origin, veteran status, disability, or any federal, state, or local protected class.
Additional Locations
United States of America - North Carolina - Morrisville