Sr. AI HW Quality Engineer

1 Month ago • 12-12 Years • Manufacturing

Job Summary

Job Description

As a Senior AI HW Quality Engineer at Microsoft, you'll develop and implement supplier quality management strategies for data center hardware. You'll lead quality improvement task forces, conduct debug and failure analysis for GPU subsystems, drive continuous improvement via RCA, and provide quality readouts based on telemetry data analysis. Responsibilities include establishing quality metrics, acting as the voice of quality in change management, and collaborating with diverse teams. The role requires extensive experience in managing manufacturing quality in the electronics industry and expertise in hardware system issue resolution for GPU servers.
Must have:
  • 12+ years relevant technical engineering experience
  • 8+ years managing manufacturing quality in electronics
  • 5+ years hardware system issue resolution for GPU Servers
  • Root cause analysis and corrective action expertise
  • Data analysis and presentation skills
  • Supplier quality management strategy development
Good to have:
  • Patent or track record of engineering excellence
  • Experience with modern server architectures (GPU, CPU)
  • System-level server debugging (power, system, network)
  • Direct GPU engineering experience in issue debug/test log review
Perks:
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Job Details

Overview

Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the team behind Microsoft’s expanding Cloud Infrastructure and is responsible for powering Microsoft’s “Intelligent Cloud” mission. SCHIE delivers the core infrastructure and foundational technologies for Microsoft's more than 200 online businesses, including Bing, MSN, Office 365, Xbox Live, Teams, OneDrive, and the Microsoft Azure platform, globally with our server and data center infrastructure, security and compliance, operations, globalization, and manageability solutions. Our focus is on smart growth, high efficiency, and delivering a trusted experience to customers and partners worldwide, and we are looking for passionate, high-energy engineers to help achieve that mission.

 

As Microsoft's cloud business continues to grow, the ability to deploy new offerings and hardware infrastructure on time, in high volume, with high quality, and at the lowest cost is of paramount importance. To achieve this goal, the Hardware, Infrastructure Management, and Fundamentals Engineering (HIFE) team is instrumental in defining and delivering operational measures of success for hardware manufacturing, improving the planning process, quality, delivery, scale, and sustainability related to Microsoft cloud hardware. We are looking for seasoned engineers with a dedicated passion for customer-focused solutions, insight, and industry knowledge to envision and implement future technical solutions that will manage and optimize the Cloud infrastructure.

 

We are looking for a Senior HW Quality Engineer to join the team.

 

#azurehwjobs   #HIFE

Qualifications

Required Qualifications:

  • 12+ years relevant technical engineering experience
    • OR Bachelor's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 5+ years technical engineering experience
    • OR Master's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 4+ years technical engineering experience
    • OR Doctorate Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 2+ years technical engineering experience.
  • 8+ years of work experience managing manufacturing quality in the electronics industry.
  • 5+ years of direct engineering experience in hardware system issue resolution for GPU Servers.
  • Versed in filtering through applicable debug data, such as telemetry and logs, to identify and investigate HW failure signatures.

Preferred Qualifications:

  • Bachelor's Degree in manufacturing, material, mechanical, electrical, and industrial engineering, or related field AND 7+ years experience in a manufacturing environment/repair
    • OR Master's Degree in manufacturing, material, mechanical, electrical, and industrial engineering, or related field AND 6+ years experience in a high-volume manufacturing environment
    • OR Doctorate in manufacturing, material, mechanical, electrical, and industrial engineering, or related field AND 3+ years experience in a manufacturing environment/repair
    • OR 9+ years equivalent experience.
  • Patent or track record of engineering excellence.
  • 12+ years of experience working with modern server architectures, including an understanding of GPU and CPU methods for failure analysis, debugging, or validation.
  • 8+ years of system-level server debugging with an understanding of power, system, and network environments.
  • 3+ years of direct GPU-related engineering experience in issue debug/test log review.
  • Leadership skills and the ability to collaborate with diverse teams and drive a call to action.
  • Expertise in root cause analysis and corrective action methods to identify contributing factors of production defects.
  • Ability to analyze large data sets, extract key insights, and effectively present and communicate the results.
  • Proficient communication and project management skills.

Responsibilities

  • Develop and implement a robust supplier quality management strategy to ensure the data center hardware is manufactured at the highest level of quality standards. 
  • Lead quality issue and improvement task forces to contain, mitigate, and resolve the top quality issues impacting global data centers.
  • Conduct debug and failure analysis for GPU subsystems in the Azure fleet and drive resolution with partners and suppliers.
  • Drive the continuous improvement process based on Root Cause Analysis (RCA) and identified opportunities. 
  • Deliver quality readouts based on your telemetry data analysis to bring clarity on status, actions across the organization, and next steps for issue resolution.
  • Establish Critical-to-Quality performance metrics to measure and improve product quality. 
  • Act as the voice of quality in the hardware change management process, ensuring quality requirements are considered, met, and improved.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
