Principal AI Platform Architect

1 Month ago • 10+ Years • DevOps • $137,600 PA - $294,000 PA

Job Summary

Job Description

Microsoft seeks a Principal AI Platform Architect to lead the design and architecture of next-generation AI supercomputers and platforms within the Azure ecosystem. This role demands expertise in AI platform architecture, rack-scale server design, and collaboration with diverse engineering teams (software, electrical, mechanical, thermal). Responsibilities include driving architectural concepts, partnering with silicon development organizations, and conducting trade-off studies considering TCO, performance, and power efficiency. The ideal candidate will possess deep experience in deploying AI/GPU systems at scale and have a strong understanding of AI workloads and hardware impact on performance.
Must have:
  • 10+ years AI platform architecture experience
  • Expertise in co-designing with datacenter/server teams
  • Ability to articulate architectural tradeoffs
  • Drive technology partners towards optimal solutions
Good to have:
  • Experience deploying AI systems at scale in cloud environments
  • Knowledge of AI training/inference workloads
  • TCO analysis expertise
  • PCBA design experience
Perks:
  • Industry-leading healthcare
  • Educational resources
  • Product and service discounts
  • Savings and investment programs
  • Maternity/paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Job Details

Overview

The Azure Platform Architecture team is at the forefront of technology and system design, leading the way for the next generation of systems and AI supercomputers. Our mission is to architect the most performant, secure, reliable, and cost- and power-optimized solutions that are deployed and managed at hyperscale and power Azure. Leading the AI platform architecture for these systems, which power one of the largest hardware deployments on earth, requires deep technical knowledge and partnership across many teams. This individual will act as the subject matter expert and platform architect for Microsoft's internal artificial intelligence (AI) accelerator family of products, helping articulate and define our next-generation platforms. This requires working across multiple domains, including product, software, electrical, mechanical, thermal, performance, and deployment, to find the right solution trade-offs.

We are looking for a Principal AI Platform Architect to join the team. 

Our team is part of a broader hardware and infrastructure organization known as Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE). SCHIE is the team behind Microsoft’s expanding Cloud Infrastructure and is responsible for powering Microsoft’s “Intelligent Cloud” mission. SCHIE delivers the core infrastructure and foundational technologies for Microsoft's 200+ online businesses, including Teams, OneDrive, Office 365, Xbox Live, Skype, Bing, MSN, and the Microsoft Azure platform globally.

We architect and design the server and data center infrastructure, security and compliance, operations, globalization, and manageability solutions to support these businesses. Our focus is on smart growth, high efficiency, and delivering a trusted experience to customers and partners worldwide. As Microsoft's cloud business continues to grow, the ability to deploy new offerings and hardware infrastructure on time, at hyperscale, with high reliability and the best performance/price level is paramount.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.  

In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Qualifications

Required Qualifications:  

  • 10+ years of technical engineering experience
      o OR Bachelor's degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 8+ years of technical engineering experience
      o OR Master's degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 5+ years of technical engineering experience
      o OR Doctorate degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 4+ years of technical engineering experience.
  • 10+ years of demonstrated expertise in AI platform and/or rack-scale server architecture and design.
  • 10+ years of demonstrated expertise in co-designing with datacenter, server, silicon, firmware/software orchestration, and manufacturing engineering organizations.

 

Other Requirements:  

Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

 

Preferred Qualifications: 

  • Experience deploying AI or GPU systems at scale within a cloud service provider or hyper-scale company. 
  • Knowledge of AI training and inference workloads, and understanding of how hardware impacts AI performance, operations, and efficiencies. 
  • Ability to analyze AI system concepts from a total cost of ownership (TCO), performance per TCO, and performance per watt perspective, including understanding system constraints that drive design tradeoffs. 
  • Expertise in conducting tradeoff studies for electrical, mechanical, thermal, and hardware systems. 
  • Experience in PCBA (Printed Circuit Board Assembly) design, including schematic creation, layout, routing, power, and signal integrity. 

 

Hardware Engineering IC5 - The typical base pay range for this role across the U.S. is USD $137,600 - $267,000 per year. A different range applies to specific work locations within the San Francisco Bay Area and New York City metropolitan area, where the base pay range for this role is USD $180,400 - $294,000 per year. Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

 

Microsoft will accept applications for the role until Jan 11th, 2025. 

 

 

#AfroTech2024 

Responsibilities

  • Drive platform, rack, and datacenter-level architectural concepts and definition for Microsoft AI system products.  
  • Build relationships with our internal silicon development organizations, technology, and development partners to drive leading edge innovation into our next generation products.   
  • Partner across Microsoft teams and collaborate to deliver industry leading products.  
  • Distill and articulate architectural tradeoffs encompassing electrical, signal integrity, mechanical, power, and thermal inputs in terms of key metrics such as total cost of ownership (TCO), performance, power efficiency, schedule, and risk.
  • Drive and influence technology providers and design partners towards optimal components and solutions to meet the future requirements for Azure’s infrastructure. 
  • Embody our culture and values.
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
Industry leading healthcare
Educational resources
Discounts on products and services
Savings and investments
Maternity and paternity leave
Generous time away
Giving programs
Opportunities to network and connect

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
