Research Scientist / Engineer – Pre-training / Scaling

Luma

Job Summary

The Pre-Training / Scaling team at Luma builds the core multimodal AI systems that power the entire platform. The role involves leading research in multimodal foundation models across video, image, text, and audio; designing novel algorithms and architectures for large-scale generative AI models; and developing methodologies for training these models across thousands of GPUs. It also includes researching and implementing state-of-the-art techniques in Autoregressive LLMs, Vision Language Models, and Diffusion Models, and collaborating with cross-functional teams to move research into production.

Job Description

About the Role

At Luma, the Pre-Training / Scaling team is responsible for building the core multimodal AI systems that power our entire platform. Working at the forefront of generative AI research, this team develops the fundamental architectures and training methodologies that enable our models to see, hear, understand, and interact with the world across video, image, text, and audio modalities.

Responsibilities

  • Lead cutting-edge research in multimodal foundation models spanning video, image, text, and audio
  • Design and implement novel algorithms, architectures, and techniques for large-scale generative AI models
  • Develop training methodologies for foundation models across thousands of GPUs
  • Research and implement state-of-the-art techniques in Autoregressive LLMs, Vision Language Models, and/or Diffusion Models
  • Collaborate with cross-functional teams to transition research into production systems

Experience

  • Expertise in Python and PyTorch with experience building ML models from scratch
  • Deep understanding of multimodal generative models and deep learning architectures
  • (Preferred) Strong research track record in generative AI with published work in top-tier venues
  • (Preferred) Experience with large-scale distributed training systems

Your application is reviewed by real people.

Compensation

The base pay range for this role is $187,500 – $395,000 per year.

Skills Required For This Role

Python, PyTorch, Deep Learning, Algorithms, Cross-Functional Collaboration
