Gen AI Audio Researcher

2 Days ago • All levels

Job Summary

Job Description

Brahma, a pioneering enterprise AI company, is seeking a Gen AI Researcher for Audio to develop next-generation voice synthesis models. The role involves researching and building deep learning systems that generate expressive, natural-sounding speech from text or audio prompts, and collaborating with cross-functional teams to integrate that work into production-ready pipelines. Key responsibilities include researching and developing state-of-the-art voice synthesis models, building and fine-tuning them, designing training pipelines and datasets, exploring techniques for emotional expressiveness and multilingual synthesis, and staying current with academic and industry trends.
Must have:
  • Strong background in machine learning and deep learning.
  • Hands-on experience with TTS or voice cloning.
  • Proficiency with Python and PyTorch.
  • Experience training models at scale with large audio datasets.
  • Familiarity with vocoders and transformer-based architectures.
  • Strong problem-solving skills in a remote-first environment.
Good to have:
  • PhD in Computer Science or Machine Learning.
  • Contributions to open-source speech research.
  • Familiarity with lip-syncing or audio-driven animation.
  • Experience with voice datasets or proprietary pipelines.

Job Details

Brahma is a pioneering enterprise AI company developing Astras, AI-native products built to help enterprises and creators innovate at scale. Brahma enables teams to break creative bottlenecks, accelerate storytelling, and deliver standout content with speed and efficiency. Part of the DNEG Group, Brahma brings together Hollywood’s leading creative technologists, innovators in AI and Generative AI, and thought leaders in the ethical creation of AI content.
 
We are looking for a Gen AI Researcher for Audio to join our team and help develop next-generation voice synthesis models. You'll research and build deep learning systems that can generate expressive, natural-sounding speech from text or audio prompts, and collaborate with cross-functional teams to integrate your work into production-ready pipelines.

Key Responsibilities

  • Research and develop state-of-the-art voice synthesis models (e.g., TTS, voice cloning, speech-to-speech).
  • Build and fine-tune models using frameworks like PyTorch and Hugging Face.
  • Design training pipelines and datasets for scalable voice model training.
  • Explore techniques for emotional expressiveness, multilingual synthesis, and speaker adaptation.
  • Work closely with product and creative teams to ensure models meet quality and production constraints.
  • Stay on top of academic and industry trends in speech synthesis and related fields.
 
Must Haves
  • Strong background in machine learning and deep learning, with a focus on speech and audio.
  • Hands-on experience with TTS, voice cloning, or related voice synthesis tasks.
  • Proficiency with Python and PyTorch; experience with libraries like torchaudio, ESPnet, or similar.
  • Experience training models at scale and working with large audio datasets.
  • Familiarity with vocoders and transformer-based architectures.
  • Strong problem-solving skills and the ability to work autonomously in a remote-first environment.
 
Nice to Have
  • PhD in Computer Science or Machine Learning, with publications in top venues.
  • Contributions to open-source speech research or participation in relevant benchmarks.
  • Familiarity with adjacent areas like lip-syncing, audio-driven animation, or expressive speech control.
  • Experience with voice datasets or proprietary pipelines.
 
About Us
We are DNEG, one of the world’s leading visual effects and animation companies for the creation of award-winning feature film, television, and multiplatform content. We employ more than 9,000 people, with offices and studios across North America (Los Angeles, Montréal, Toronto, Vancouver), Europe (London), Asia (Bangalore, Mohali, Chennai, Mumbai), and Australia (Sydney). At DNEG, we fundamentally believe that embracing our differences is a vital component of our collective success. We are committed to creating an equitable, diverse, and inclusive work environment for our global teams, where everyone feels they matter and belong. We welcome and encourage applications from all, regardless of background, experience, or disability. Please let us know if you need any adjustments or support during the application process; we will do our best to accommodate your needs. We look forward to meeting you!
