AI Senior Research Scientist
Tencent
Job Summary
The AI Senior Research Scientist will research and optimize content generation models (text, image, audio, 3D models) to improve quality, diversity, controllability, and efficiency. The role involves algorithm training and optimization in areas like image generation, multi-modal large models, and few-shot learning, aiming to enhance user experience and support algorithm productization. Responsibilities include prompt optimization, generation model R&D, adapter development, and performance acceleration for AI painting, text, and video generation, addressing industrial deployment challenges of multimodal generative models.
Must Have
- Research and optimize content generation models (text, image, audio, 3D).
- Address challenges in generation quality, diversity, controllability, and efficiency.
- Conduct algorithm training for image generation, multi-modal large models, few-shot learning.
- Improve AI painting, text, and video generation performance.
- Resolve algorithm bottlenecks when applying models in real business scenarios.
- Address industrial deployment of multimodal generative models.
- PhD in Computer Science, Artificial Intelligence, Mathematics, or related fields.
- Solid foundation in computer vision or machine learning algorithms.
- Proficient in machine learning and deep learning fundamentals.
- Familiar with mainstream AIGC frameworks (GAN, VAE, VQGAN, Diffusion models).
- Familiar with generation model extensions (ControlNet, LoRA, Textual Inversion).
- Familiar with multi-modal models (CLIP, ERNIE-ViL, transformer-based).
- Fluency in both English and Mandarin.
Good to Have
- Publications in top conferences or journals.
- Hands-on experience in NLP, multi-modal learning, or AI-generated content.
- Strong learning ability, clear logical thinking, excellent communication skills.
- High level of curiosity.
- Good teamwork and interpersonal communication skills.
Job Description
Business Unit
Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as the construction and operation of R&D management and data centers. TEG provides users with a full range of customer services. As the operator of the largest networking, devices, and data center in Asia, TEG also leads the Tencent Technology Committee in strengthening infrastructure R&D through internal and distributed open source collaboration, constructing new platforms and supporting business innovation.
What the Role Entails
- Research and optimize content generation models (text, image, audio, 3D models, etc.) to address challenges such as generation quality, diversity, controllability, and efficiency.
- Aim to improve user experience and production effectiveness, and support the productization of algorithms.
- Conduct algorithm training and optimization in areas such as image generation, multi-modal large models, and few-shot learning.
- Based on in-house products and business needs, improve the performance and experience of AI painting, text generation, and video generation through prompt optimization, generation model R&D, adapter development, and performance acceleration. This work also includes resolving algorithm bottlenecks when applying models in real business scenarios.
- Address the industrial deployment of multimodal generative models and actively explore model design and optimization in an R&D context.
Who We Look For
- PhD (preferably full-time) in Computer Science, Artificial Intelligence, Mathematics, or related fields.
- Solid foundation in computer vision or machine learning algorithms; candidates with publications in top conferences or journals are preferred.
- Proficient in machine learning and deep learning fundamentals, and familiar with mainstream AIGC frameworks, including GAN, VAE, VQGAN, Diffusion models, etc.
- Familiar with generation model extensions such as ControlNet, LoRA, and Textual Inversion.
- Familiar with multi-modal models like CLIP, ERNIE-ViL, and other transformer-based cross-modal representation models. Hands-on experience in NLP, multi-modal learning, or AI-generated content is a strong plus.
- Strong learning ability, clear logical thinking, excellent communication skills, and a high level of curiosity.
- Good teamwork and interpersonal communication skills.
- Fluency in both English and Mandarin to work with international stakeholders and stakeholders based at HQ.