Multimodal Large Model Algorithm Intern
Tencent
Job Summary
This role involves conducting research and development in multimodal large model technologies, focusing on cross-modal alignment and understanding tasks to build industry-leading models. The intern will track state-of-the-art algorithms, participate in the design, training, optimization, and evaluation of these models, and promote their application in business scenarios within Tencent's Technology Engineering Group (TEG). TEG supports the company's technology and operational platforms, R&D management, and data centers, providing comprehensive customer services and leading infrastructure R&D.
Must Have
- Master’s degree or higher in Computer Science, Machine Learning, Artificial Intelligence, Applied Mathematics, or related fields.
- Solid research background in multimodal understanding (e.g., natural language processing, computer vision, speech understanding/generation).
- Familiarity with mainstream models and algorithms such as CLIP, LLaVA, and VALL-E.
- Proficiency in deep learning frameworks like TensorFlow or PyTorch.
- Knowledge of distributed training frameworks (e.g., DeepSpeed, Megatron-LM) and practical experience in multi-node/multi-GPU distributed training.
- Strong engineering skills with proficiency in at least one programming language: C/C++, Java, or Python.
- Excellent learning ability, technical curiosity, and strong teamwork and communication skills.
Good to Have
- Publication record in top-tier conferences (e.g., ICLR, NeurIPS, CVPR, ICCV, ECCV, ACL, EMNLP) is preferred.
Job Description
Business Unit
Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as for the construction and operation of R&D management and data centers, providing users with a full range of customer services. As the operator of the largest network, device fleet, and data centers in Asia, TEG also leads the Tencent Technology Committee in strengthening infrastructure R&D through internal and distributed open-source collaboration, constructing new platforms and supporting business innovation.
What the Role Entails
- Conduct research and development of multimodal large model technologies, including cross-modal alignment and multimodal understanding tasks, to build industry-leading multimodal large models.
- Continuously track state-of-the-art algorithms in multimodal large models, participate in the design, training, optimization, and evaluation of these models, and promote their application in business scenarios.
Who We Look For
- Master’s degree or higher in Computer Science, Machine Learning, Artificial Intelligence, Applied Mathematics, or related fields.
- Solid research background in multimodal understanding (e.g., natural language processing, computer vision, speech understanding/generation), and familiarity with mainstream models and algorithms such as CLIP, LLaVA, and VALL-E.
- Proficiency in deep learning frameworks like TensorFlow or PyTorch; knowledge of distributed training frameworks (e.g., DeepSpeed, Megatron-LM) and practical experience in multi-node/multi-GPU distributed training.
- Strong engineering skills with proficiency in at least one programming language: C/C++, Java, or Python.
- Publication record in top-tier conferences (e.g., ICLR, NeurIPS, CVPR, ICCV, ECCV, ACL, EMNLP) is preferred.
- Excellent learning ability, technical curiosity, and strong teamwork and communication skills.