Research Scientist / Engineer – Pre-training / Scaling
Luma
Job Summary
The Pre-Training / Scaling team at Luma is responsible for developing core multimodal AI systems that power the entire platform. This involves leading cutting-edge research in multimodal foundation models across video, image, text, and audio, designing novel algorithms and architectures for large-scale generative AI models, and developing training methodologies for these models across thousands of GPUs. The role also includes researching and implementing state-of-the-art techniques in Autoregressive LLMs, Vision Language Models, and Diffusion Models, and collaborating with cross-functional teams to transition research into production.
Job Description
About the Role
At Luma, the Pre-Training / Scaling team is responsible for building the core multimodal AI systems that power our entire platform. Working at the forefront of generative AI research, this team develops the fundamental architectures and training methodologies that enable our models to see, hear, understand, and interact with the world across video, image, text, and audio modalities.
Responsibilities
- Lead cutting-edge research in multimodal foundation models spanning video, image, text, and audio
- Design and implement novel algorithms, architectures, and techniques for large-scale generative AI models
- Develop training methodologies for foundation models across thousands of GPUs
- Research and implement state-of-the-art techniques in Autoregressive LLMs, Vision Language Models, and/or Diffusion Models
- Collaborate with cross-functional teams to transition research into production systems
Experience
- Expertise in Python and PyTorch with experience building ML models from scratch
- Deep understanding of multimodal generative models and deep learning architectures
- (Preferred) Strong research track record in generative AI with published work in top-tier venues
- (Preferred) Experience with large-scale distributed training systems
Your application is reviewed by real people.
Compensation
The base pay range for this role is $187,500 – $395,000 per year.