AI Hardware/Software Co-design Engineer II

1 Hour ago • 2-4 Years • DevOps

About the job

Job Description

This AI Hardware/Software Co-design Engineer II role at Microsoft's Azure Hardware Systems & Infrastructure (AHSI) involves collaborating with various teams to improve the performance and power efficiency of Azure hardware systems. Responsibilities include developing performance modeling methodologies, benchmarking GPU performance for AI workloads, identifying and resolving performance bottlenecks, and creating dashboards for performance visualization. The role requires working with AI accelerators (GPUs/DSAs), experience with CUDA and high-performance AI libraries, and proficiency in C++, Python, and scripting languages. The successful candidate will contribute to technology development planning, system power and performance management, and guide teams in software development and deployment.
Must have:
  • 2+ years experience with AI Accelerators (GPUs or DSAs)
  • Deep understanding of computer architecture and performance tradeoffs
  • Experience programming AI Accelerators/CUDA, high-performance AI libraries
  • Proficiency in C++, Python, and scripting languages
  • Proficient problem-solving and communication skills
Good to have:
  • Experience with TensorFlow/PyTorch
  • Experience developing analysis tools in C++ and Python
  • Knowledge of performance monitors and tuning
  • Familiarity with Power BI
  • Master's Degree in Electrical Engineering/Computer Engineering
Perks:
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Overview

Do you want to be at the forefront of innovating the latest hardware designs to propel Microsoft’s cloud growth? Are you seeking a unique career opportunity that combines technical capabilities and cross-team collaboration with business insight and strategy?

 

Join our Strategic Planning and Architecture (SPARC) team within Microsoft’s Azure Hardware Systems & Infrastructure (AHSI) organization and be part of the group that builds Microsoft’s expanding cloud infrastructure and powers Microsoft’s “Intelligent Cloud” mission.

Microsoft delivers more than 200 online services to more than one billion individuals worldwide and AHSI is the team behind our expanding cloud infrastructure. We deliver the core infrastructure and foundational technologies for Microsoft's cloud businesses including Microsoft Azure, Bing, MSN, Office 365, OneDrive, Skype, Teams and Xbox Live.   

 

The SPARC organization manages Azure’s hardware roadmap from architecture concept through production for all of Microsoft’s current and future online services.

 

We are looking for an AI Hardware/Software Co-design Engineer II to join the System Architecture team focusing on architecture and performance aspects of Microsoft’s Azure hardware systems deployed in various data centres across the globe.  

 

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

 

Qualifications

Minimum Qualifications: 

 

  • Bachelor’s Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 2+ years technical engineering experience

    • OR Master’s Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field

    • OR equivalent experience.

  •  2+ years of experience working with AI Accelerators such as GPUs or DSAs.

 

Other Requirements:

 

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

 

Preferred qualifications 

 

  • 4+ years related technical engineering experience  
    • OR Master's Degree in Electrical Engineering or Computer Engineering AND 2+ years technical engineering experience
    • OR Bachelor's Degree in Electrical Engineering or Computer Engineering AND 4+ years technical engineering experience
  • Deep understanding of computer architecture, SoC and software architectures, and their performance tradeoffs.
  • Working knowledge of prevailing LLMs and frameworks such as TensorFlow and PyTorch is a plus.
  • Experience programming AI accelerators, or experience with CUDA and high-performance AI libraries, is a plus.
  • Experience in development of analysis tools written in C++ and Python.
  • Knowledge of performance monitors and performance tuning.
  • Proficiency in scripting languages such as Python, Bash, or PowerShell.
  • Proficient problem-solving skills and attention to detail.
  • Proficient communication and collaboration skills.
  • Familiarity with visualization and reporting tools such as Power BI is a plus.

 

Hardware Engineering IC3 - The typical base pay range for this role across the U.S. is USD $98,300 - $193,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $127,200 - $208,800 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

Microsoft will accept applications for the role until January 27, 2025.

 

Responsibilities

 

  • Work with business, architecture, and design teams to understand performance requirements and collaborate across functional teams to meet these needs in technology development planning and path finding. 
  • Work with platform, firmware, and software teams across Microsoft to identify opportunities to improve system power and performance management with a goal of improved power efficiency across the stack. 
  • Develop in-house performance modelling methodology and tools for Machine Learning systems. 
  • Benchmark and analyze GPU performance for business-critical AI workloads.
  • Identify performance bottlenecks, optimize resource utilization, and implement improvements to enhance performance.
  • Create dashboards for performance visualization and build infrastructure to improve the analysis framework.
  • Guide teams in designing, building, testing, and deploying changes to existing software. 
  • Embody our culture and values.

 

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
Industry leading healthcare
Educational resources
Discounts on products and services
Savings and investments
Maternity and paternity leave
Generous time away
Giving programs
Opportunities to network and connect
$98.3K - $208.8K/yr (Outscal est.)
$153.6K/yr avg.
Mountain View, California, United States


About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
