Research Intern - Multimodal AI Research

31 Minutes ago • 1+ Years • Artificial Intelligence • $78,600 PA - $154,560 PA

Job Summary

Job Description

Microsoft's AI Platform team seeks Research Interns for its Multimodal Intelligence team. The internship focuses on cutting-edge research in multimodal AI, encompassing video, image, and document understanding. Responsibilities include collaborating with researchers, presenting findings, and contributing to projects in areas such as video understanding, information retrieval, and leveraging LLMs for improved document/video/image understanding. The ideal candidate is currently enrolled in a PhD program in a relevant field (Computer Vision, NLP, etc.), has at least one year of hands-on deep learning experience, and is proficient in Python and relevant tools such as PyTorch. Publications in top-tier conferences are a plus. The internship is a 12-week program based in Redmond, Washington.
Must have:
  • Current enrollment in a PhD program in a relevant field
  • 1+ year deep learning experience
  • Proficiency in Python
  • NLP/CV background
Good to have:
  • Publications in top-tier conferences
  • Experience with PyTorch
  • Familiarity with LLMs/VLMs

Job Details

Overview

Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally-recognized scientists and engineers, who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.

The AI Platform team is on a mission to advance the state of the art in AI and deliver on our company’s vision for how intelligent cloud and intelligent edge will shape the next phase of innovation. The team includes top scientists and researchers from across Microsoft who are creating a center of excellence in speech, computer vision, and natural language.

 

Within the AI Platform team, the mission of the Multimodal Intelligence (MMI) team is to make fundamental contributions to advancing the state of the art in AI technology related to video, image, document, and other multimodal inputs. Documents, for example, stand at the intersection of NLP and vision research: to fully understand a document, one needs to draw on both its language and its visual (layout) elements. We explore both single-modality and multimodal inputs, and their synergy, to conduct research on forward-looking topics such as video understanding, information retrieval, key-value extraction, few-shot named entity recognition (NER), hierarchical layout analysis, and many others.

We are looking for Research Interns to work on cutting-edge research in multimodal AI. We are particularly interested in Research Interns with a background in AI, NLP, and/or CV, including topics such as video/image understanding, document layout analysis, chart understanding, multi-page multi-document question answering, novel ways of leveraging LLMs for document/video/image understanding, and solving problems inherent to large language models (grounding, retrieval-based generation, etc.). Familiarity with modern LLMs/VLMs is a plus, but not required.

Qualifications

Required Qualifications

  • Currently enrolled in a PhD program in Computer Vision, Natural Language Processing, Deep Learning, Machine Learning, AI, or a related field.
  • At least 1 year of experience in NLP, computer vision, deep learning, or multimodal research, including hands-on deep learning experience.

Other Requirements

  • Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship.
  • In addition to the qualifications below, you’ll need to submit a minimum of two reference letters for this position as well as a cover letter and any relevant work or research samples. After you submit your application, a request for letters may be sent to your list of references on your behalf. Note that reference letters cannot be requested until after you have submitted your application, and furthermore, that they might not be automatically requested for all candidates. You may wish to alert your letter writers in advance, so they will be ready to submit your letter. 

Preferred Qualifications

  • Proficiency in algorithmic problem solving and software development (Python, C/C++, etc.).
  • Experience with open-source tools such as PyTorch.
  • Publication(s) in top-tier conferences or journals in related fields (e.g., ACL, CVPR, ECCV, ICCV, EMNLP, NAACL, NeurIPS, ICML, ICLR, IJCV, PAMI).

The base pay range for this internship is USD $6,550 - $12,880 per month. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $8,480 - $13,920 per month.

 

Certain roles may be eligible for benefits and other compensation.

Microsoft accepts applications and processes offers for these roles on an ongoing basis.

Responsibilities

Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world’s best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but they also contribute to exciting research and development strides. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research, and are offered year-round, though they typically begin in the summer.

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
