Machine Learning Engineer

1 Month ago • All levels • DevOps

Job Summary

Job Description

Hedra seeks an ML Engineer with expertise in high-performance computing to manage and optimize the computational infrastructure for training and deploying machine learning models. Responsibilities include designing scalable computing solutions for training and deploying ML models on large video datasets; managing and optimizing computing clusters or cloud instances (AWS or Google Cloud); ensuring the infrastructure can handle the resource-intensive training of large generative models; monitoring system performance and implementing improvements, using tools like Kubeflow for orchestration; and collaborating with the team to understand its computational needs and provide solutions. The role focuses on deploying and scaling video generation models, supporting Hedra's 3DVAE and video diffusion models.
Must have:
  • Experience with cloud platforms (AWS, GCP, Azure)
  • Knowledge of Docker and Kubeflow
  • Understanding of distributed training
  • Proficiency in Python or Bash
  • System administration background
  • Ability to design scalable solutions for ML model training and deployment
Perks:
  • Competitive compensation and equity
  • 401k (no match)
  • Healthcare (Silver PPO Medical, Vision, Dental)
  • Lunch and snacks at the office

Job Details

Hedra is a pioneering generative media company backed by top investors at Index, A16Z, and Abstract Ventures. We're building Hedra Studio, a multimodal creation platform capable of control, emotion, and creative intelligence.

At the core of Hedra Studio is our Character-3 foundation model, the first omnimodal model in production. Character-3 jointly reasons across image, text, and audio for more intelligent video generation — it’s the next evolution of AI-driven content creation.

Note: At Hedra, we’re a team of hard-working, passionate individuals seeking to fundamentally change content and build a generational company together. You should have start-up experience and be a self-starter who is driven to build impactful products that change the status quo. You must be willing to work in person in either NYC or SF.

Overview:

We are looking for an ML Engineer with expertise in high-performance computing systems to manage and optimize our computational infrastructure for training and deploying our machine learning models. The ideal candidate will have experience with cloud computing platforms and tools for managing ML workloads at scale, supporting our 3DVAE and video diffusion models.

Responsibilities:

  • Design and implement scalable computing solutions for training and deploying ML models, ensuring infrastructure can handle large video datasets.

  • Manage and optimize the performance of our computing clusters or cloud instances, such as AWS or Google Cloud, to support distributed training.

  • Ensure that our infrastructure can handle the resource-intensive tasks associated with training large generative models.

  • Monitor system performance and implement improvements to maximize efficiency, using tools like Kubeflow for orchestration.

  • Collaborate with the team to understand their computational needs and provide appropriate solutions, facilitating seamless model deployment.

Qualifications:

  • Bachelor’s degree in Computer Science, Information Technology, or a related field, with a focus on system administration.

  • Experience with cloud computing platforms such as Amazon Web Services, Google Cloud, or Microsoft Azure, essential for managing large-scale ML workloads.

  • Knowledge of containerization tools like Docker and orchestration tools like Kubeflow, crucial for deploying models at scale.

  • Understanding of distributed training techniques and how to scale models across multiple GPUs or machines, aligning with video generation needs.

  • Proficiency in scripting languages like Python or Bash for automation tasks, facilitating infrastructure management.

  • Strong problem-solving and communication skills, given the need to collaborate with diverse teams.

This role is vital for ensuring the computational backbone supports the company’s ML efforts, focusing on deployment and scalability.

Benefits:

  • Competitive compensation and equity

  • 401k (no match)

  • Healthcare (Silver PPO Medical, Vision, Dental)

  • Lunch and snacks at the office

We encourage you to apply even if you don't fully meet all the listed requirements; we value potential and diverse perspectives, and your unique skills could be a great asset to our team.

About The Company

We are a creation lab building foundation models into products that power the next generation of human storytelling.

San Francisco, California, United States (On-Site)

New York, New York, United States (On-Site)
