Research Intern - LLM Inference Acceleration and Optimization

1 Month ago • Up to 1 Year • $78,600 PA - $154,560 PA

Job Summary

Job Description

This Research Internship at Microsoft's AIFX team focuses on accelerating and optimizing Large Language Model (LLM) inference. Interns will investigate and implement cutting-edge techniques like quantized KV-caches, flash/paged/radix attention, speculative decoding, and advanced collective communication on GPUs. The work involves leveraging state-of-the-art approaches like "You only cache once (YOCO)" to improve LLM serving efficiency at scale. The internship includes exploring, implementing, optimizing, and potentially publishing research findings related to real-world production workloads. Collaboration with Microsoft teams and contributions to open-source projects like vLLM, SGLang, and HuggingFace are key aspects of this role.
Must have:
  • PhD in CS or related field
  • 6+ months LLM training/inference experience
  • Experience with LLMs like Llama and Phi
  • Ability to convert research ideas into code
Good to have:
  • Experience with large-scale GPU communication
  • AI framework benchmarking experience (PyTorch, vLLM, SGLang)
  • Strong interpersonal skills
  • Open to fast iteration and ambitious ideas
Perks:
  • Industry leading healthcare
  • Educational resources
  • Product and service discounts
  • Savings and investments
  • Maternity/paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Job Details

Overview

Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally-recognized scientists and engineers, who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.

If you are excited about investigating and implementing cutting-edge large language model (LLM) inference techniques and optimizations like quantized KV-caches, flash/paged/radix attention, speculative decoding, and advanced collective communication on graphics processing units (GPUs), come join the AIFX team at Microsoft Azure and contribute to a production-focused, planetary-scale LLM serving stack that is being built on top of excellent open-source efforts like vLLM, SGLang, and HuggingFace. The work includes investigation of cutting-edge, state-of-the-art approaches like "You only cache once (YOCO)" and leveraging them to save memory and compute for serving LLMs at scale. You will get a chance to explore, implement, optimize, and publish your research ideas in collaboration with teams at Microsoft working on real-world production workloads at an unprecedented scale.

Qualifications

Required Qualifications

  • Accepted or currently enrolled in a PhD program in Computer Science or related STEM field.
  • At least 6 months of experience with training and/or inference of recent LLMs like Llama and Phi.

Other Requirements

  • Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship.
  • In addition to the qualifications below, you’ll need to submit a minimum of two reference letters for this position as well as a cover letter and any relevant work or research samples. After you submit your application, a request for letters may be sent to your list of references on your behalf. Note that reference letters cannot be requested until after you have submitted your application, and furthermore, that they might not be automatically requested for all candidates. You may wish to alert your letter writers in advance, so they will be ready to submit your letter. 

Preferred Qualifications

  • Experience with large-scale collective communication on GPUs.
  • Experience with performance benchmarking of AI frameworks like PyTorch, vLLM, and/or SGLang.
  • Ability to convert research ideas into working code that runs and scales on real systems.
  • Strong interpersonal skills and a growth mindset.
  • Open to failing fast in pursuit of ambitious ideas.

The base pay range for this internship is USD $6,550 - $12,880 per month. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $8,480 - $13,920 per month.

 

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: 

Microsoft accepts applications and processes offers for these roles on an ongoing basis.

Responsibilities

Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world’s best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but they also contribute to exciting research and development strides. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research, and are offered year-round, though they typically begin in the summer.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
