Software Engineer, Machine Learning Infrastructure

2 Months ago • 4 Years + • Artificial Intelligence

Job Summary

Job Description

Character.AI seeks a seasoned ML Infrastructure engineer to design, build, and maintain training and serving infrastructure for ML research and product development. Responsibilities include providing infrastructure support for ML research, building tooling for diagnosing cluster issues and hardware failures, monitoring deployments, managing experiments, and maximizing GPU allocation and utilization. The ideal candidate possesses 4+ years of experience supporting ML infrastructure, developing diagnostic tools, and working with cloud platforms like Compute Engine, Kubernetes, and Cloud Storage. Experience with GPUs is essential.
Must have:
  • 4+ years supporting ML infrastructure
  • Experience developing diagnostic tools for ML infrastructure
  • Experience with cloud platforms (Compute Engine, Kubernetes, Cloud Storage)
  • GPU experience
Good to have:
  • Large GPU clusters and high-performance computing/networking
  • Large language model training support
  • ML frameworks (PyTorch/TensorFlow/JAX)
  • GPU kernel development

Job Details

About the role

We’re looking for seasoned ML Infrastructure engineers with experience designing, building and maintaining training and serving infrastructure for ML research.

Responsibilities:

  • Provide infrastructure support for our ML research and product development

  • Build tooling to diagnose cluster issues and hardware failures

  • Monitor deployments, manage experiments, and generally support our research

  • Maximize GPU allocation and utilization for both serving and training

Requirements:

  • 4+ years of experience supporting infrastructure in an ML environment

  • Experience developing tools that diagnose ML infrastructure problems and failures

  • Experience with cloud platforms (e.g., Compute Engine, Kubernetes, Cloud Storage)

  • Experience working with GPUs

Nice to have:

  • Experience with large GPU clusters and high-performance computing/networking

  • Experience supporting large language model training

  • Experience with ML frameworks like PyTorch/TensorFlow/JAX

  • Experience with GPU kernel development

About Character.AI

Founded in 2021, Character is a leading AI company offering personalized experiences through customizable AI 'Characters.' As one of the most widely used AI platforms worldwide, Character enables users to interact with AI tailored to their unique needs and preferences.

In just two years, we achieved unicorn status and were named Google Play's AI App of the Year – a testament to our groundbreaking technology and vision.

Ready to shape the future of Consumer AI? 🚀

At Character, we value diversity and welcome applicants from all backgrounds. As an equal opportunity employer, we firmly uphold a policy of non-discrimination on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, or disability. Your unique perspectives are vital to our success.

Compensation Range: $150K - $350K

About The Company

Character is one of the world's leading personal AI platforms. Founded in 2021 by AI pioneers Noam Shazeer and Daniel De Freitas, Character is a full-stack AI company with a globally scaled direct-to-consumer platform. 

