Research Intern - Multimodal AI Research

4 Hours ago • 1+ Years • Artificial Intelligence • $78,600 PA - $154,560 PA

About the job

Job Description

Microsoft's AI Platform team seeks Research Interns for its Multimodal Intelligence (MMI) team. The internship involves cutting-edge research in multimodal AI, focusing on video, image, and document understanding. Responsibilities include collaborating with researchers, presenting findings, and contributing to projects such as video understanding, information retrieval, and key-value extraction. Candidates must be currently enrolled in a PhD program in a relevant field (AI, NLP, CV) and have at least one year of hands-on deep learning experience. Familiarity with LLMs/VLMs is a plus. The internship is a 12-week program in which interns are paired with mentors and expected to contribute to the team's research community.
Must have:
  • Current enrollment in a PhD program in a relevant field
  • 1+ years deep learning experience
  • NLP/CV/AI background
  • Proficient in Python
  • Collaboration skills
Good to have:
  • LLM/VLM familiarity
  • Publications in top conferences
  • Experience with PyTorch
  • C/C++ proficiency
Perks:
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Overview

Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally recognized scientists and engineers, who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.

The AI Platform team is on a mission to advance the state of the art in AI and deliver on our company’s vision for how intelligent cloud and intelligent edge will shape the next phase of innovation. The team includes top scientists and researchers from across Microsoft who are creating a center of excellence in speech, computer vision, and natural language.

Within the AI Platform, the mission of the Multimodal Intelligence (MMI) team is to make fundamental contributions to advancing the state of the art in AI technology related to video, image, document, and other multimodal inputs. Documents, for example, stand at the intersection of NLP and vision research: to fully understand a document, one needs to draw on both its language and its visual (layout) elements. We explore both single-modality and multimodal inputs, and their synergy, to conduct research on forward-looking topics such as video understanding, information retrieval, key-value extraction, few-shot Named Entity Recognition (NER), hierarchical layout analysis, and many others.

We are looking for Research Interns to work on cutting-edge research in multimodal AI. We are particularly interested in Research Interns with a background in AI, NLP, and/or CV, including topics such as video/image understanding, document layout analysis, chart understanding, multi-page multi-document question answering, novel ways of leveraging LLMs for document/video/image understanding, and solving problems inherent to large language models (grounding, retrieval-based generation, etc.). Familiarity with modern LLMs/VLMs is a plus, but not required.

Qualifications

Required Qualifications

  • Currently enrolled in a PhD program in Computer Vision, Natural Language Processing, Deep Learning, Machine Learning, AI, or a related field.
  • At least 1 year of experience in NLP, computer vision, deep learning, or multimodal research, including hands-on deep learning work.

Other Requirements

  • Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship.
  • In addition to the qualifications below, you’ll need to submit a minimum of two reference letters for this position as well as a cover letter and any relevant work or research samples. After you submit your application, a request for letters may be sent to your list of references on your behalf. Note that reference letters cannot be requested until after you have submitted your application, and furthermore, that they might not be automatically requested for all candidates. You may wish to alert your letter writers in advance, so they will be ready to submit your letter. 

Preferred Qualifications

  • Proficiency in algorithmic problem solving and software development (Python, C/C++, etc.).
  • Experience with open-source tools such as PyTorch.
  • Publication(s) in top-tier conferences or journals in related fields (e.g., ACL, CVPR, ECCV, ICCV, EMNLP, NAACL, NeurIPS, ICML, ICLR, IJCV, PAMI).

The base pay range for this internship is USD $6,550 - $12,880 per month. A different range applies to specific work locations within the San Francisco Bay Area and New York City metropolitan area; the base pay range for this role in those locations is USD $8,480 - $13,920 per month.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: 

Microsoft accepts applications and processes offers for these roles on an ongoing basis.

Responsibilities

Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world’s best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but they also contribute to exciting research and development strides. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research, and are offered year-round, though they typically begin in the summer.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
$78.6K - $154.6K/yr (Outscal est.)
$116.6K/yr avg.
Redmond, Washington, United States

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.

Bengaluru, Karnataka, India (On-Site)

Hyderabad, Telangana, India (On-Site)

Vancouver, British Columbia, Canada (Remote)

Redmond, Washington, United States (On-Site)

Suzhou, Jiangsu, China (On-Site)
