Microsoft

Senior Software Engineer (AI/ML)

Microsoft, Irving, Texas, United States, 75084

The Worldwide Fleet Resources Lifecycle Management team is dedicated to revolutionizing the management and optimization of Microsoft's global fleet resources. In addition to enhancing operational efficiency, reducing costs, and improving sustainability, the team is responsible for automating how new hardware is verified, managed, and delivered to Microsoft datacenters. This includes supporting Azure, High-Performance Computing, Office, and Edge Computing products within Microsoft.

Job Overview

As a Senior Software Engineer (AI/ML) in the Worldwide Fleet Resources Lifecycle Management team, you will play a pivotal role in supporting the onboarding of new hardware into the Azure cloud and driving the integration of intelligence into tools, processes, and resources across the entire organization. You will also be expected to understand requirements, create designs, and implement features needed to enable new technologies.

Responsibilities

Collaborates with stakeholders to identify user requirements, create design documents, and develop scalable systems and services.
Works with product managers, engineers, and infrastructure teams to deliver impactful solutions.
Utilizes strong software engineering fundamentals, including clean architecture, modular design, thorough testing, and peer reviews for reliable codebases.
Develops and optimizes code to enhance performance, maintainability, effectiveness, and return on investment (ROI).
Develops and deploys scalable Artificial Intelligence (AI)-driven tools, algorithms, and machine learning (ML) models to enhance efficiency, reliability, and productivity.
Collaborates with data scientists and product teams to align solutions with business objectives and deliver measurable value.
Optimizes AI/ML models for performance and ensures seamless production integration.
Breaks down larger features into work items and supports planning, ensuring alignment with business priorities.
Estimates engineering effort and tracks progress to ensure that all tasks are completed efficiently and effectively.
Serves as the Designated Responsible Individual (DRI) for monitoring, troubleshooting, and restoring production systems during on-call rotations.
Leads live-site incident response, conducts root cause analysis, and implements long-term improvements to enhance system reliability and operational readiness.
Demonstrates a commitment to continuous learning, staying up to date with evolving technologies and best practices.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
