ML Research Engineer, ML Systems

1 Day ago • All levels • $200,800 PA - $251,000 PA

Job Summary

Job Description

The ML platform (RLXF) team at Scale builds an internal distributed framework for large language model training and inference. This platform supports MLEs, researchers, data scientists, and operators in efficiently training and evaluating LLMs, as well as assessing data quality. As an ML Research Engineer, you will collaborate with Scale's ML teams and researchers, contributing to the platform that supports ML research and development. Your role involves optimizing the platform to enable the next generation of LLM training, inference, and data curation.
Must have:
  • Experience with multi-node LLM training and inference
  • Experience with developing large-scale distributed ML systems
  • Strong software engineering skills, proficient in CUDA, PyTorch, transformers, etc.
  • Strong written and verbal communication skills
Good to have:
  • Demonstrated expertise in post-training methods
  • Experience with next-generation use cases for large language models

Job Details

Scale’s ML platform (RLXF) team builds our internal distributed framework for large language model training and inference. The platform has been powering MLEs, researchers, data scientists and operators for fast and automatic training and evaluation of LLMs, as well as evaluation of data quality.

Scale is uniquely positioned at the heart of the field of AI as an indispensable provider of training and evaluation data and end-to-end solutions for the ML lifecycle. You will work closely with Scale’s ML teams and researchers to build the foundational platform that supports all our ML research and development. You will be building and optimizing the platform to enable our next generation of LLM training, inference and data curation.

If you are excited about shaping the future of AI via fundamental innovations, we would love to hear from you!

You will:

  • Build, profile and optimize our training and inference framework
  • Collaborate with ML teams to accelerate their research and development and enable them to develop the next generation of models and data curation
  • Research and integrate state-of-the-art technologies to optimize our ML system

Ideally you’d have:

  • Strong excitement about system optimization
  • Experience with multi-node LLM training and inference
  • Experience with developing large-scale distributed ML systems
  • Strong software engineering skills, proficient in frameworks and tools such as CUDA, PyTorch, transformers, flash attention, etc.
  • Strong written and verbal communication skills and the ability to operate in a cross-functional team environment

Nice to haves:

  • Demonstrated expertise in post-training methods and/or next-generation use cases for large language models, including instruction tuning, RLHF, tool use, reasoning, agents, and multimodal use cases







