Member of Technical Staff, High Performance Computing Engineer

2 Weeks ago • 6-6 Years • Artificial Intelligence • $137,600 PA - $294,000 PA

Job Summary

Job Description

Microsoft AI is seeking experienced High Performance Computing Engineers to contribute to the evolution of Copilot. Responsibilities include building secure and performant AI Platform services, collaborating with other engineers and researchers, shipping high-quality code, overcoming roadblocks to deliver work quickly, and thriving in a fast-paced environment. The role involves working on large-scale supercomputers and developing APIs to enhance Copilot's functionalities. The ideal candidate will possess strong problem-solving skills, excellent communication abilities, and a collaborative work ethic. This position requires working in the office 3 days a week and is located in Mountain View, CA.
Must have:
  • Build secure & performant AI Platform services
  • Collaborate with Platform, infrastructure, application engineers & AI Researchers
  • Ship high-quality, well-tested, secure, and maintainable code
  • 6+ years experience with high-scale training clusters (e.g., Nvidia InfiniBand, SLURM, Kubernetes, Ray)
  • 6+ years experience building scalable services on public cloud (Azure, AWS, GCP)
  • Proficiency in Python, C#, C++, Rust, or Java
Good to have:
  • Experience with LLM training clusters
  • Experience with AI platforms, frameworks, and APIs
  • Experience using Machine Learning frameworks
  • Ability to identify and resolve complex technical issues

Job Details

Job Description

Overview
As Microsoft AI, we are pushing the boundaries of technology. We are creating unique, beautiful and powerful products that will change lives. A small, friendly, fast-moving team, we support each other to do the best work of our lives, always looking to break new ground, fast. We are proud of what we build, how we build it and that our products will define the AI era. We run lean, obsess about users, and always make our decisions based on the evidence. We ship regularly, so your work will have real and immediate impact.
We are seeking experienced High Performance Computing Engineers to join our team and contribute to the evolution of our personal AI, Copilot. This role offers the unique opportunity to work on some of the largest scale supercomputers in the world, a rare chance to operate at such a significant scale. The right candidate will bring a wealth of positive energy, empathy, and kindness, coupled with a track record of effectiveness. You'll be proactive, relishing the challenge of crafting top-tier consumer experiences and products swiftly and efficiently. Our team is at the forefront of developing APIs that enhance our ability to fine-tune and deploy Copilot's core functionalities, in partnership with our Product Management, Design, and AI Research teams.
 
Our newly formed organization, Microsoft AI, is dedicated to advancing Copilot and other consumer AI products and research. The team is responsible for Copilot, Bing, Edge, and generative AI research. Come be a part of the team shaping the future of personal computing.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. 
 
By applying to this Mountain View, CA position, you are required to be local to the San Francisco area and in office 3 days a week. 
 
Responsibilities: 
  • Build secure and performant AI Platform services that power Copilot.
  • Work collaboratively with other platform, infrastructure, and application engineers, as well as AI researchers, to build next-generation AI products and services.
  • Ship high-quality, well-tested, secure, and maintainable code.
  • Find a path through roadblocks so that your work reaches users quickly and iteratively.
  • Enjoy working in a fast-paced, design-driven, product development cycle.
  • Embody our culture and values.

Required/Minimum Qualifications:
  • Bachelor’s degree in computer science or a related technical discipline AND 6 years of technical engineering experience building web services, coding in languages including, but not limited to, Python, C#, C++, Rust, or Java, OR equivalent experience.
  • 6 years of experience working with high-scale training clusters (e.g., working with frameworks and tools such as NVIDIA InfiniBand clusters, SLURM, Kubernetes, and Ray).
  • 6 years of experience building scalable services on top of public cloud infrastructure such as Azure, AWS, or GCP.

Preferred Qualifications:
  • Experience with LLM training clusters. 
  • Experience working with AI platforms, frameworks, and APIs.
  • Experience using machine learning frameworks, including experience using, deploying, and scaling large language models, either personally or professionally.
  • Ability to identify, analyze, and resolve complex technical issues, ensuring optimal performance, scalability, and user experience.
  • Dedication to writing clean, maintainable, and well-documented code with a focus on application quality, performance, and security.
  • Demonstrated interpersonal skills and ability to work closely with cross-functional teams, including product managers, designers, and other engineers.
  • Ability to clearly communicate complex technical concepts to both technical and non-technical stakeholders.
  • Passion for learning new technologies and staying up to date with industry trends, best practices, and emerging technologies in web development and AI.
  • Ability to work in a fast-paced environment, manage multiple priorities, and adapt to changing requirements and deadlines.
  • Proven ability to collaborate and contribute to a positive, inclusive work environment, fostering knowledge sharing and growth within the team.
 
Software Engineering IC5 - The typical base pay range for this role across the U.S. is USD $137,600 - $267,000 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $180,400 - $294,000 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
Microsoft will accept applications and process offers for these roles on an ongoing basis.





About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.

