Member of Technical Staff, AI Pretraining Platform

23 Minutes ago • All levels • Artificial Intelligence

Job Summary

Microsoft AI is seeking a Member of Technical Staff to contribute to their cutting-edge AI pre-training platform. This role involves designing and developing Python and CUDA/HIP C++ code for distributed training of multimodal LLMs, building and maintaining infrastructure for petabyte-scale data processing, partnering with other teams to improve data recipes, and collaborating on identifying gaps in current models. Responsibilities include optimizing for scalability, performance, and reliability on a large-scale GPU cluster. The ideal candidate will be passionate about large-scale AI infrastructure, thrive in a fast-paced collaborative environment, and demonstrate a high degree of craftsmanship.
Must have:
  • Python & CUDA/HIP C++ development
  • Experience with HPC and parallel programming
  • Large-scale AI model training experience
  • GPU cluster experience

Job Details


Job Description

Help build the world’s most advanced training platform at Microsoft AI 

We are on a mission to create the leading pretraining platform to develop the world’s most capable AI frontier models. This platform will span one of the world’s foremost GPU clusters, pushing the boundaries of scale, performance, and reliability.

The AI Pre-training Platform team at Microsoft AI is responsible for all aspects of infrastructure including scalability, benchmarking, kernel development, performance optimizations, communications, and fault tolerance to support our model pre-training operations. We are an interdisciplinary team of engineers and scientists, learning from each other, and collaborating to create the best models, methods and products. We work closely with the teams that transform pre-trained models into the models that power the consumer Copilot experience. 

We are looking for outstanding individuals excited about contributing to the next generation of systems that will transform the field. We are looking for candidates who: 
  • Are passionate about the infrastructure enabling large-scale AI model training 
  • Will thrive in a highly collaborative, fast-paced environment 
  • Have a high degree of craftsmanship and pay close attention to details 
  • Demonstrate a proactive attitude and enthusiasm for exploring new methods and technologies 
  • Can manage multiple responsibilities effectively and adjust to shifting priorities
 
Responsibilities 
  • Design and develop Python and CUDA/HIP C++ code that enables distributed training of multimodal LLMs ingesting text, audio, images, or video data.
  • Build and maintain cutting-edge infrastructure that can store and process the petabytes of data needed to power models. 
  • Partner with the pretraining and post-training teams to improve our data recipe through rigorous, careful experimentation.
  • Collaborate with the product team and other engineers and researchers across Microsoft AI to identify gaps in the current generation of models. 
  • Embody our and
 

Required/Minimum Qualifications  
  • Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND experience in business analytics, data science, software development, data modeling or data engineering work 
  • OR Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND experience in business analytics, data science, software development, or data engineering work 
  • Experience with HPC (high-performance computing) and/or parallel programming
  • Experience in the area of pretraining
  • Experience working with GPU clusters

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the .
 
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
 
#Copilot #MicrosoftAI


About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
