Research Scientist / Engineer – Pre-training / Scaling

Luma

Job Summary

The Pre-Training / Scaling team at Luma builds the core multimodal AI systems that power the entire platform. The role involves leading research in multimodal foundation models across video, image, text, and audio; designing novel algorithms and architectures for large-scale generative AI models; and developing methodologies for training these models across thousands of GPUs. It also includes researching and implementing state-of-the-art techniques in Autoregressive LLMs, Vision Language Models, and Diffusion Models, and collaborating with cross-functional teams to move research into production.

Job Description

About the Role

At Luma, the Pre-Training / Scaling team is responsible for building the core multimodal AI systems that power our entire platform. Working at the forefront of generative AI research, this team develops the fundamental architectures and training methodologies that enable our models to see, hear, understand, and interact with the world across video, image, text, and audio modalities.

Responsibilities

  • Lead cutting-edge research in multimodal foundation models spanning video, image, text, and audio
  • Design and implement novel algorithms, architectures, and techniques for large-scale generative AI models
  • Develop training methodologies for foundation models across thousands of GPUs
  • Research and implement state-of-the-art techniques in Autoregressive LLMs, Vision Language Models, and/or Diffusion Models
  • Collaborate with cross-functional teams to transition research into production systems

Experience

  • Expertise in Python and PyTorch with experience building ML models from scratch
  • Deep understanding of multimodal generative models and deep learning architectures
  • (Preferred) Strong research track record in generative AI with published work in top-tier venues
  • (Preferred) Experience with large-scale distributed training systems

Your application is reviewed by real people.

Compensation

The base pay range for this role is $187,500 – $395,000 per year.

Skills Required For This Role

Python, PyTorch, Deep Learning, Algorithms, Cross-Functional Collaboration
