SEAL Research Scientist, Scalable Oversight

2 Months ago • 3 Years + • Research Development • $176,000 PA - $300,000 PA

Job Summary

Job Description

As a Research Scientist working on Scalable Oversight, you will develop and assess methods for evaluating and supervising advanced AI systems. You will design experiments to exemplify failure modes of current supervision protocols for language models, simulate expertise and capability gaps between supervisor and model for scalable oversight experiments, develop new supervision protocols and gather human annotations using these protocols, and train language models using reinforcement learning, analyzing their behavior and comparing models. You will be working on the cutting edge of AI safety and helping shape the future of the AI industry.
Must have:
  • Experience in conducting technical research collaboratively.
  • Proficiency in frameworks like PyTorch, JAX, or TensorFlow.
  • A track record of published research in machine learning, particularly in generative AI.
  • At least three years of experience addressing sophisticated ML problems.
  • Strong written and verbal communication skills.
Good to have:
  • Hands-on experience with open source LLM fine-tuning.
  • Experience in crafting evaluations or a background in data science roles related to LLM technologies.
  • Experience working with a cloud technology stack (e.g., AWS or GCP).

Job Details

As the leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of large language models (LLMs) and safeguarding them. The Safety, Evaluations and Alignment Lab (SEAL) is Scale’s frontier research effort dedicated to tackling challenging research problems in evaluation, red teaming, and alignment of advanced AI systems.

We are actively seeking talented researchers to join us in shaping the landscape for safety and transparency for the entire AI industry. We support collaborations across the industry and academia and the publication of our research findings. 

As a Research Scientist working on Scalable Oversight, you will develop and assess methods for evaluating and supervising advanced AI systems. For example, you might do any or all of the following:

  • Design experiments to exemplify failure modes of current supervision protocols for language models 
  • Design experiments to simulate expertise and capability gaps between supervisor and model for scalable oversight experiments
  • Develop new supervision protocols and gather human annotations using these protocols 
  • Train language models using reinforcement learning, analyzing their behavior and comparing models

Ideally you’d have:

  • Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance.
  • Practical experience conducting technical research collaboratively, with proficiency in frameworks like PyTorch, JAX, or TensorFlow. You should also be adept at interpreting research literature and quickly turning new ideas into prototypes.
  • A track record of published research in machine learning, particularly in generative AI.
  • At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development.
  • Strong written and verbal communication skills to operate in a cross-functional team.

Nice to have:

  • Hands-on experience with open-source LLM fine-tuning or involvement in bespoke LLM fine-tuning projects using PyTorch/JAX.
  • Experience in crafting evaluations or a background in data science roles related to LLM technologies.
  • Experience working with a cloud technology stack (e.g., AWS or GCP) and developing machine learning models in a cloud environment.

Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions.
