Research Intern - Multimodal AI Research

1 Month ago • 1+ Years • Artificial Intelligence • $78,600 PA - $154,560 PA

Job Summary

Job Description

Microsoft's AI Platform team seeks Research Interns for its Multimodal Intelligence team. The internship focuses on cutting-edge research in multimodal AI, encompassing video, image, and document understanding. Responsibilities include collaborating with researchers, presenting findings, and contributing to projects such as video understanding, information retrieval, and the use of LLMs for document, video, and image understanding. Candidates must be currently enrolled in a PhD program in a relevant field (Computer Vision, NLP, etc.) and have at least one year of hands-on deep learning experience. Proficiency in Python and experience with tools such as PyTorch are preferred, and publication in top-tier conferences is a plus. The internship is a 12-week program based in Redmond, Washington.
Must have:
  • Current enrollment in a PhD program in a relevant field
  • 1+ year deep learning experience
  • NLP/CV background
Good to have:
  • Proficiency in Python
  • Publications in top-tier conferences
  • Experience with PyTorch
  • Familiarity with LLMs/VLMs

Job Details

Overview

Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally-recognized scientists and engineers, who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.

The AI Platform team is on a mission to advance the state of the art in AI and deliver on our company’s vision for how intelligent cloud and intelligent edge will shape the next phase of innovation. The team includes top scientists and researchers from across Microsoft who are creating a center of excellence in speech, computer vision, and natural language.

Within the AI Platform, the mission of the Multi-modal Intelligence (MMI) team is to make fundamental contributions to advancing the state of the art in AI technology related to video, image, document, and other multimodal inputs. Documents, for example, stand at the intersection of NLP and vision research. To fully understand a document, one needs to draw on both its language and its visual (layout) elements. We explore single-modality and multimodal inputs, and their synergy, to conduct research on forward-looking topics such as video understanding, information retrieval, key-value extraction, few-shot Named Entity Recognition (NER), hierarchical layout analysis, and many others.

We are looking for Research Interns to work on cutting-edge research in Multimodal AI. We are particularly interested in Research Interns with a background in AI, NLP, and/or CV, including topics such as video/image understanding, document layout analysis, chart understanding, multi-page multi-document question answering, novel ways of leveraging LLMs for document/video/image understanding, and solving problems inherent to large language models (grounding, retrieval-based generation, etc.). Familiarity with modern LLMs/VLMs is a plus, but not required.

Qualifications

Required Qualifications

  • Currently enrolled in a PhD program in Computer Vision, Natural Language Processing, Deep Learning, Machine Learning, AI, or a related field.
  • At least 1 year of experience in NLP, computer vision, deep learning, or multimodal research, including hands-on deep learning experience.

Other Requirements

  • Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship.
  • In addition to the qualifications below, you’ll need to submit a minimum of two reference letters for this position as well as a cover letter and any relevant work or research samples. After you submit your application, a request for letters may be sent to your list of references on your behalf. Note that reference letters cannot be requested until after you have submitted your application, and furthermore, that they might not be automatically requested for all candidates. You may wish to alert your letter writers in advance, so they will be ready to submit your letter. 

Preferred Qualifications

  • Strong algorithmic problem-solving and software development skills (Python, C/C++, etc.).
  • Experience with open-source tools such as PyTorch.
  • Publication(s) in top-tier conferences or journals in related fields (e.g., ACL, CVPR, ECCV, ICCV, EMNLP, NAACL, NIPS, ICML, ICLR, IJCV, PAMI).

The base pay range for this internship is USD $6,550 - $12,880 per month. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $8,480 - $13,920 per month.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: 

Microsoft accepts applications and processes offers for these roles on an ongoing basis.

Responsibilities

Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world’s best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but they also contribute to exciting research and development strides. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research, and are offered year-round, though they typically begin in the summer.

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
