Member of Technical Staff, AI Pretraining Platform

2 Weeks ago • All levels • Research & Development

Job Summary

Job Description

The Member of Technical Staff, AI Pretraining Platform at Microsoft AI will contribute to building a world-leading pre-training platform for developing cutting-edge AI models. Responsibilities involve designing and developing Python and CUDA/HIP C++ code for distributed training of multimodal LLMs, building and maintaining infrastructure to handle petabytes of data, collaborating with pre-training and post-training teams to improve the data recipe, and partnering with product teams and researchers to identify model gaps. The role requires expertise in HPC and parallel programming, as well as experience pre-training large AI models. This position is crucial for pushing the boundaries of AI model capabilities, and the platform supports the models that power the consumer Copilot experience.
Must have:
  • Python and CUDA/HIP C++ coding
  • HPC and parallel programming experience
  • Experience with AI model pre-training
  • Building and maintaining large-scale infrastructure
  • Collaborating with cross-functional teams

Job Details

Overview

Help build the world’s most advanced training platform at Microsoft AI 

We are on a mission to create the leading pretraining platform to develop the world’s most capable AI frontier models. This platform will span one of the world’s foremost GPU clusters, pushing the boundaries of scale, performance, and reliability. 

The AI Pre-training Platform team at Microsoft AI is responsible for all aspects of infrastructure including scalability, benchmarking, kernel development, performance optimizations, communications, and fault tolerance to support our model pre-training operations. We are an interdisciplinary team of engineers and scientists, learning from each other, and collaborating to create the best models, methods and products. We work closely with the teams that transform pre-trained models into the models that power the consumer Copilot experience. 

We are looking for outstanding individuals excited about contributing to the next generation of systems that will transform the field. We are looking for candidates who: 

  • Are passionate about the infrastructure enabling large-scale AI model training 
  • Will thrive in a highly collaborative, fast-paced environment 
  • Have a high degree of craftsmanship and pay close attention to details 
  • Demonstrate a proactive attitude and enthusiasm for exploring new methods and technologies 
  • Can manage multiple responsibilities effectively and adjust to shifting priorities 

Qualifications

  • Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND experience in business analytics, data science, software development, data modeling or data engineering work 
  • OR Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND experience in business analytics, data science, software development, or data engineering work 
  • OR equivalent experience. 
  • Experience with high-performance computing (HPC) and/or parallel programming
  • Experience with AI model pretraining
  • Experience working with GPU clusters

#Copilot #MicrosoftAI

Responsibilities

  • Design and develop Python and CUDA/HIP C++ code that enables distributed training of multimodal LLMs ingesting text, audio, image, or video data. 
  • Build and maintain cutting-edge infrastructure that can store and process the petabytes of data needed to power models. 
  • Partner with the pretraining and post-training teams to improve our data recipe through rigorous, careful experimentation. 
  • Collaborate with the product team and other engineers and researchers across Microsoft AI to identify gaps in the current generation of models. 
  • Embody our culture and values. 

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
