Engineering Manager, Machine Learning

1 Month ago • All levels • Research Development • $200,000 PA - $300,000 PA

Job Summary

Job Description

Captions, a leading AI video company, is seeking an ML Engineering Lead to head a small, impactful team focused on deploying large-scale multimodal video diffusion models. Responsibilities include driving the technical vision for production deployment, overseeing ML pipelines for GPU-based inference and model optimization, adapting state-of-the-art generative models, and building/maintaining serving infrastructure for low-latency, high-throughput inference. The role also involves model optimization through techniques like quantization and pruning, implementing fine-tuning workflows, and managing production MLOps, performance, and scaling using CI/CD pipelines and monitoring tools. Experience with cutting-edge AI in audio-video generation and diffusion architectures is key to ensuring these innovations reach millions of creators.
Must have:
  • Deploy deep learning models on GPU infrastructure (NVIDIA GPUs, CUDA, TensorRT)
  • Containerization (Docker, Kubernetes) and microservice architectures for ML serving
  • Proficiency in Python and PyTorch/TensorFlow
  • Experience with compression techniques for large models (quantization, pruning)
  • Profiling and optimizing model inference (batching, concurrency, hardware utilization)
  • ML pipeline orchestration (Airflow, Kubeflow, Argo) and CI/CD for ML
  • Logging, monitoring, and alerting tools (Prometheus, Grafana)
  • Familiarity with diffusion models or large-scale generative architectures
Good to have:
  • Experience with distributed training frameworks (FSDP, DeepSpeed, Megatron-LM) or HPC environments
Perks:
  • Comprehensive medical, dental, and vision plans
  • 401K with employer match
  • Commuter Benefits
  • Catered lunch multiple days per week
  • Dinner stipend every night if working late
  • Doordash DashPass subscription
  • Health & Wellness Perks (Talkspace, Kindbody, One Medical subscription, HealthAdvocate, Teladoc)
  • Multiple team offsites per year with monthly team events
  • Generous PTO policy

Job Details

Captions is the leading AI video company—our mission is to empower anyone, anywhere to tell their stories through video. Over 10 million creators and businesses have used Captions to simplify video creation with truly novel and groundbreaking AI capabilities.

We are a rapidly growing team of ambitious, experienced, and devoted engineers, researchers, designers, marketers, and operators based in NYC. As an early member of our team, you’ll have an opportunity to have an outsized impact on our products and our company's culture.

Our Technology

Mirage Announcement: our proprietary omni-modal foundation model

Seeing Voices (technical paper): generating A-roll video from audio with Mirage

Mirage Studio: for generating expressive videos at scale

"Captions: For Talking Videos": available in the iOS App Store

Press Coverage

Lenny’s Podcast: Interview with Gaurav Misra (CEO)

Latest Fundraise: Series C Announcement

The Information: 50 Most Promising Startups

Fast Company: Next Big Things in Tech

Business Insider: 34 most promising AI startups

TIME: The Best Inventions of 2024

Our Investors

We’re very fortunate to have some of the best investors and entrepreneurs backing us, including Index Ventures, Kleiner Perkins, Sequoia Capital, Andreessen Horowitz, Uncommon Projects, Kevin Systrom, Mike Krieger, Lenny Rachitsky, Antoine Martin, Julie Zhuo, Ben Rubin, Jaren Glover, SVAngel, 20VC, Ludlow Ventures, Chapter One, and more.

Please note that all of our roles require you to be in person at our NYC HQ (located in Union Square).

We do not work with third-party recruiting agencies; please do not contact us.

About the Role:
Captions is seeking an ML Engineering Lead to head a small, high-impact team of ML engineers who bring large-scale multimodal video diffusion models into production. The ML engineering team is responsible for optimizing and deploying state-of-the-art generative models (tens to hundreds of billions of parameters) to deliver low-latency, high-throughput inference at scale. This is a unique opportunity to work on cutting-edge AI (spanning audio-video generation, diffusion architectures, and temporal modeling) and ensure these innovations reach millions of creators worldwide.

Responsibilities:

  • Technical Leadership

    • Drive the technical vision for deploying large-scale multimodal diffusion models (tens to hundreds of billions of parameters) in production.

    • Oversee and contribute to core ML pipelines—from GPU-based inference to model optimization.

    • Collaborate with researchers to adapt state-of-the-art generative models for real-world performance and reliability.

  • Inference & Deployment

    • Develop high-performance GPU-based inference pipelines for large multimodal diffusion models.

    • Build, optimize, and maintain serving infrastructure to deliver low-latency predictions at large scale.

    • Collaborate with software engineering teams to containerize models, manage autoscaling, and ensure uptime SLAs.

  • Model Optimization & Fine-Tuning

    • Leverage techniques like quantization, pruning, and distillation to reduce latency and memory footprint without compromising quality.

    • Implement continuous fine-tuning workflows to adapt models based on real-world data and feedback.

  • Production MLOps, Performance, Scaling

    • Design and maintain automated CI/CD pipelines for model deployment, versioning, and rollback.

    • Implement robust monitoring (latency, throughput, concept drift) and alerting for critical production systems.

    • Explore cutting-edge GPU acceleration frameworks (e.g., TensorRT, Triton, TorchServe) to continuously improve throughput and reduce costs.

Requirements:

  • Technical Expertise

    • Proven experience deploying deep learning models on GPU-based infrastructure (NVIDIA GPUs, CUDA, TensorRT, etc.).

    • Strong knowledge of containerization (Docker, Kubernetes) and microservice architectures for ML model serving.

    • Proficiency with Python and at least one deep learning framework (PyTorch, TensorFlow).

  • Model Optimization

    • Familiarity with compression techniques (quantization, pruning, distillation) for large-scale models.

    • Experience profiling and optimizing model inference (batching, concurrency, hardware utilization).

  • Infrastructure

    • Hands-on experience with ML pipeline orchestration (Airflow, Kubeflow, Argo) and automated CI/CD for ML.

    • Strong grasp of logging, monitoring, and alerting tools (Prometheus, Grafana, etc.) in distributed systems.

  • Domain Experience

    • Exposure to diffusion models, multimodal video generation, or large-scale generative architectures.

    • Experience with distributed training frameworks (FSDP, DeepSpeed, Megatron-LM) or HPC environments.

Benefits:

  • Comprehensive medical, dental, and vision plans

  • 401K with employer match

  • Commuter Benefits

  • Catered lunch multiple days per week

  • Dinner stipend every night if you're working late and want a bite!

  • Doordash DashPass subscription

  • Health & Wellness Perks (Talkspace, Kindbody, One Medical subscription, HealthAdvocate, Teladoc)

  • Multiple team offsites per year with team events every month

  • Generous PTO policy

Captions provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

Please note that benefits apply to full-time employees only.

About The Company

Captions is the leading video AI company, building the future of video creation. Over 10 million creators and businesses have used Captions to create videos for social media, marketing, sales, and more. We're on a mission to serve the next billion. We are a rapidly growing team of ambitious, experienced, and devoted engineers, researchers, designers, marketers, and operators based in NYC. You'll join an early team and have an outsized impact on the product and the company's culture.
