Sr Staff R&D Engineer

5+ Years • Research & Development • $201,900–$270,700 PA

Job Description

The Skywalker Sound Development Group is seeking a highly accomplished Sr Staff R&D Engineer (AI/ML) to lead the development of transformative audio intelligence technologies for global media production. This senior-level role is central to advancing our next-generation soundtrack platform, with a focus on speech processing, style transfer, upmixing, source separation, and generative audio synthesis. You will architect, build, and optimize cutting-edge machine learning systems at scale—leveraging foundational models, neural vocoders, latent diffusion models, and advanced retraining workflows. As a core member of our applied R&D team, you will contribute to technical direction, collaborate across product and engineering, and deliver production-ready solutions that integrate seamlessly into creative and operational workflows for elite content creators worldwide.

Job Details

What You’ll Do

  • Lead the research, design, and implementation of state-of-the-art machine learning algorithms for speech processing, voice transfer, source separation, and upmixing in media post-production environments.
  • Drive the architecture and deployment of scalable model training pipelines using PyTorch and distributed computing frameworks.
  • Develop novel generative audio models, including latent diffusion, flow-based models, variational autoencoders, and neural vocoders, optimized for professional soundtrack production.
  • Own end-to-end model lifecycle management: pretraining, fine-tuning, validation, inference optimization, and CI/CD integration.
  • Guide the development of personalized model adaptation workflows to support per-user tuning, cross-project continuity, and flexible deployment.
  • Collaborate with product, platform, and engineering leads to define integration strategies within a secure, cloud-optimized SaaS environment.
  • Stay at the forefront of generative audio, multi-modal modeling, and self-supervised learning—translating emerging research into applied innovation.
  • Contribute to internal tooling and infrastructure that improves iteration speed, reproducibility, and explainability of deployed models.
  • Mentor junior researchers and engineers, and contribute to a culture of rigorous experimentation, collaboration, and continuous improvement.
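
For illustration, the core masking step behind the source-separation work mentioned above can be sketched in a few lines. This is a minimal numpy sketch of Wiener-style ratio masking, not a description of Skywalker Sound's actual pipeline; the array shapes and the toy two-source example are assumptions made for demonstration.

```python
import numpy as np

def soft_masks(source_mags, eps=1e-8):
    """Ratio (Wiener-like) masks from per-source magnitude estimates.

    source_mags: non-negative array of shape (n_sources, freq, time),
    e.g. the output of a separation network. Returns masks of the same
    shape that sum to ~1 at every time-frequency bin with energy.
    """
    total = source_mags.sum(axis=0, keepdims=True) + eps
    return source_mags / total

def apply_masks(mixture_stft, masks):
    """Apply each mask to the complex mixture STFT to get source STFTs."""
    return masks * mixture_stft[None, :, :]

# Toy example: two "sources" estimated in disjoint frequency bands.
rng = np.random.default_rng(0)
mix = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))
est = np.zeros((2, 4, 5))
est[0, :2] = 1.0   # source 0 estimated in the low band
est[1, 2:] = 1.0   # source 1 estimated in the high band

masks = soft_masks(est)
sources = apply_masks(mix, masks)
```

Because ratio masks partition the mixture, the masked sources sum back to (approximately) the original mixture spectrogram, a useful invariant to test in any separation pipeline.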

What We’re Looking For

  • MSc or PhD in Computer Science, Electrical Engineering, Applied Math, or a related field with a focus on AI/ML and multi-modal signal processing.
  • 5+ years of professional experience in applied ML, with a deep focus on audio-centric AI/ML research and deployment.
  • Expertise in building and scaling models using PyTorch, with fluency in training, fine-tuning, and inference for deep neural networks.
  • Demonstrated experience developing generative models such as VAE, GAN, diffusion models, or neural vocoders (e.g., HiFi-GAN, WaveNet).
  • Deep understanding of audio-specific ML domains, including source separation, speech enhancement, music processing, and cross-modal tasks.
  • Experience with MLOps tooling (e.g., Weights & Biases, MLflow, Datachain), Docker-based containerization, and scalable infrastructure for distributed training.
  • Fluency in audio signal processing fundamentals and the integration of DSP into ML pipelines.
  • Proven ability to contribute to architectural planning, research strategy, and production deployment in complex, multi-stakeholder environments.
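
The "DSP fundamentals into ML pipelines" requirement above usually starts with a front end like the one sketched below: framing, Hann windowing, and an rFFT magnitude spectrogram. This is a generic illustrative sketch; the frame length, hop size, and the 440 Hz sanity check are arbitrary choices, and a production feature extractor would add mel scaling, log compression, and normalization.

```python
import numpy as np

def spectrogram(x, frame_len=256, hop=128):
    """Magnitude spectrogram via framing, Hann windowing, and rFFT.

    Returns an array of shape (n_frames, frame_len // 2 + 1), the
    kind of time-frequency representation typically fed to an
    audio ML model.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

# Sanity check: a 440 Hz tone at 16 kHz has a bin resolution of
# 16000 / 256 = 62.5 Hz, so it should peak near bin 440 / 62.5 ≈ 7.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=0).argmax())  # expected: 7
```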

Preferred Qualifications

  • Familiarity with audio/text/video multi-modal frameworks and cross-domain representations.
  • Experience implementing real-time or near-real-time inference pipelines in cloud or edge environments (e.g., AWS, GCP, on-prem GPUs).
  • Working knowledge of latent diffusion audio models (e.g., stable-audio, AudioLDM, AudioGen).
  • Strong knowledge of industry-standard audio datasets and benchmarks (LibriSpeech, VCTK, MUSDB, etc.).
  • Experience optimizing inference pipelines for creative applications or interactive use.
  • Proficiency in lower-level audio frameworks (C/C++, etc.).
  • Contributions to published research at top-tier conferences (NeurIPS, ICASSP, ICLR, Interspeech) and/or open-source ML frameworks.
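
As a sketch of the near-real-time inference pattern mentioned above, the snippet below processes audio in overlapping blocks with Hann-window overlap-add. The `model` callable is a hypothetical stand-in for a per-frame neural inference call (any frame-to-frame function works); the identity model merely demonstrates that the streaming wrapper reconstructs its input away from the edges, since a Hann window at 50% overlap approximately satisfies the constant-overlap-add condition.

```python
import numpy as np

def stream_process(x, model, frame_len=512, hop=256):
    """Blockwise (near-real-time) processing with Hann overlap-add.

    Each hop, one windowed frame is passed through `model` and the
    result is accumulated into the output buffer, so latency is
    bounded by one frame rather than the whole signal.
    """
    window = np.hanning(frame_len)
    out = np.zeros(len(x))
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = window * x[start : start + frame_len]
        out[start : start + frame_len] += model(frame)
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal(4096)
y = stream_process(x, model=lambda frame: frame)  # identity "model"
```

Interior samples of `y` match `x` to within the small deviation of numpy's symmetric Hann window from exact COLA; the first and last half-frames are edge-attenuated, which real streaming pipelines handle with warm-up padding.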


