Machine Learning Engineer, GenAI Quality

3 Months ago • 3 Years + • Quality Assurance • $172,000 PA - $300,000 PA

Job Summary

Job Description

This role focuses on developing ML systems to automate data quality evaluation and generation using large language models. You will build scalable systems to assess quality across accuracy, instruction adherence, factuality, and reasoning, and design robust evaluation frameworks to ensure alignment with human standards. You will be deeply involved in the full lifecycle, from model design and fine-tuning to prototyping, deployment, and monitoring. You will partner closely with engineering, research, and product teams to deliver cutting-edge solutions for both customers and internal GenAI data engines.

Must have:
  • 3+ years of experience designing, training, and deploying ML models
  • Strong background in NLP, LLMs, and deep learning frameworks
  • Experience building microservices and deploying ML pipelines
  • Practical knowledge of LLM fine-tuning and evaluation
  • Strong programming skills and a solid foundation in algorithms

Good to have:
  • Experience with post-training LLM techniques
  • Familiarity with data evaluation pipelines and dataset curation
  • Background in multimodal ML or model evaluation

Job Details

About Scale:

Scale’s Generative AI ML team develops models and services to power high-quality data generation and evaluation for the most advanced large language models on earth. We also conduct applied research on model supervision and algorithmic approaches that support frontier models for Scale’s applied-ML teams and the broader AI community. Scale is uniquely positioned at the center of the AI ecosystem as a leading provider of training and evaluation data, end-to-end ML lifecycle solutions, and frontier evaluations for public and private institutions.

About The Role:

This role focuses on developing ML systems to automate data quality evaluation and generation using large language models. You’ll build scalable systems to assess quality across accuracy, instruction adherence, factuality, and reasoning — and design robust evaluation frameworks to ensure alignment with human standards. This is one of the highest impact areas in the company and directly accelerates the development of aligned, performant foundation models.

You’ll be deeply involved in the full lifecycle: from model design and fine-tuning, to prototyping, deployment, and monitoring. You’ll partner closely with engineering, research, and product teams to deliver cutting-edge solutions for both customers and internal GenAI data engines — Scale’s fastest-growing business.

If you’re excited about combining human-machine evaluation, scaling high-quality training data, and shaping the next generation of foundation models, we’d love to hear from you.

You will:

  • Design, fine-tune, and evaluate large language models for structured quality evaluation and data generation tasks
  • Develop robust evaluation frameworks to assess performance across accuracy, instruction following, reasoning, and other critical dimensions
  • Build and maintain scalable ML services to automatically assess and generate high-quality training and evaluation data
  • Research and apply state-of-the-art techniques in LLM training, post-training alignment (e.g., instruction tuning, RLHF), and tool-augmented reasoning
  • Collaborate with research scientists, engineers, and product teams to integrate your work into production services used by top AI developers

Ideally you’d have:

  • 3+ years of experience designing, training, and deploying ML models in production environments
  • Strong background in NLP, LLMs, and deep learning frameworks like PyTorch, TensorFlow, or JAX
  • Experience building microservices and deploying ML pipelines in cloud environments (e.g., AWS or GCP)
  • Practical knowledge of LLM fine-tuning and evaluation for tasks like factuality, instruction adherence, and chain-of-thought reasoning
  • Strong programming skills (e.g., Python) and a solid foundation in algorithms and data structures
  • Strong communication skills and experience working cross-functionally

Nice to haves:

  • Experience with post-training LLM techniques (instruction tuning, RLHF, tool use, or agent-based reasoning)
  • Familiarity with data evaluation pipelines, dataset curation, or scalable annotation workflows
  • Background in multimodal ML or model evaluation across domains such as code or long-context generation
