Principal Software Engineer - GPU Performance

2 Weeks ago • 8-10 Years • Artificial Intelligence • $161,600 PA - $314,400 PA

About the job

Job Description

Microsoft's AI Platform organization seeks a Principal Software Engineer to focus on GPU performance analysis and optimization for large-scale AI model training and inference. The role involves collaborating with hardware teams, ML developers, and OpenAI to build and optimize software stacks for next-generation AI supercomputers and accelerators (like Maia-100). Responsibilities include software development (C/C++, Python, CUDA, ROCm, Triton), performance analysis, identifying requirements, and collaborating with various teams to deliver robust solutions for state-of-the-art AI models. This is a hands-on technical role demanding expertise in GPU programming and optimization techniques.
Must have:
  • 8+ years experience
  • 4+ years C/C++ experience
  • 4+ years GPU application experience
  • GPU kernel optimization
  • Collaboration skills
Good to have:
  • Advanced degree
  • Low-level programming expertise
  • Profiling tool proficiency (NVIDIA tools)
  • Deep learning workload experience
  • CUDA, ROCm, or Triton experience
Perks:
  • Industry leading healthcare
  • Educational resources
  • Product and service discounts
  • Savings and investments
  • Maternity/paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Overview

Microsoft is a company where passionate innovators come to collaborate, envision what can be, and take their careers further. This is a world of more possibilities, more innovation, more openness, and sky's-the-limit thinking in a cloud-enabled world.

The Artificial Intelligence (AI) Platform organization at Microsoft builds the end-to-end Azure AI stack/Platform as a Service (PaaS) and is core to Azure’s innovation and differentiation, as well as all of Microsoft’s flagship products, from Office to Teams, to Xbox. We are the team building Azure OpenAI, Azure Machine Learning (ML), Cognitive Services, and the global Azure AI infrastructure for running the largest AI workloads on the planet.

 

We do not just value differences or different perspectives. We seek them out and invite them in so we can tap into the collective power of everyone in the company. As a result, our customers are better served.

The Artificial Intelligence (AI) Frameworks team at Microsoft develops the AI software used to train and deploy the world’s most advanced AI models. We collaborate with our hardware teams and partners to build the software stacks for Microsoft’s next-generation supercomputers and the new Maia-100 AI accelerator.  We work closely with ML researchers and developers to optimize and scale out model training and inference.  We work directly with OpenAI on the models hosted on the Azure OpenAI service.

We are hiring a Principal Software Engineer to work on graphics processing unit (GPU) performance analysis and optimization.  As a member of this team, you will have the opportunity to work on the fundamental abstractions, programming models, runtimes, libraries and application programming interfaces (APIs) to enable large scale training and inferencing of models on novel AI hardware.

 

This is a technical role: it requires hands-on software design and development skills. We’re looking for someone who has a demonstrated history of solving hard technical problems and is motivated to tackle the hardest problems in building a full end-to-end AI stack. An entrepreneurial approach and the ability to take initiative and move fast are essential.

In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required/Minimum Qualifications 

  • Bachelor's Degree in Computer Science, or related technical discipline AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python

    • OR equivalent experience.

  • 4+ years' experience with C/C++
  • 4+ years’ practical experience working on real-world applications that use GPUs, and experience optimizing GPU kernels for performance

Other Requirements:

Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

 

Preferred/Additional Qualifications 

  • Bachelor's Degree or advanced degree in computer engineering, computer science, or related fields, and 10+ years of software development experience.
  • Experience analyzing low-level program behavior, including performance and memory usage, and proficiency with profiling tools such as NVIDIA Visual Profiler, nvprof, and NVIDIA Nsight Compute
  • Technical background and foundation in software engineering principles, architecture design, and performance analysis
  • Intellectual curiosity and passion for learning new technologies
  • Exposure to state-of-the-art Deep Neural Network training and inference workloads, including research techniques
  • Great cross-team collaboration skills and the desire to collaborate in a team of researchers and developers
Software Engineering IC6 - The typical base pay range for this role across the U.S. is USD $161,600 - $286,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $209,600 - $314,400 per year.
  
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
   
Microsoft will accept applications and process offers for these roles on an ongoing basis.

 

 


Responsibilities

  • Collaborate broadly across multiple disciplines from hardware designers to ML developers
  • Engage with key partners to understand and implement robust performance analysis and optimization for state-of-the-art large language models (LLMs) and other models.
  • Perform software development in C/C++ and Python, and GPU development with CUDA, ROCm, or Triton.
  • Identify requirements, scope solutions, estimate work, and schedule deliverables.
  • Embody our culture and values.
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
Redmond, Washington, United States


About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.

Cambridge, England, United Kingdom (On-Site)

Texas, United States (On-Site)

Phoenix, Arizona, United States (On-Site)

Bengaluru, Karnataka, India (On-Site)
