SEAL Research Scientist, Scalable Oversight

1 Month ago • 3 Years + • $176,000 PA - $300,000 PA

Job Summary

Job Description

As a Research Scientist working on Scalable Oversight, you will develop and evaluate methods for evaluating and supervising advanced AI systems. You will design experiments that demonstrate failure modes of current supervision protocols for language models, simulate expertise and capability gaps between supervisor and model for scalable oversight experiments, develop new supervision protocols and gather human annotations using them, and train language models with reinforcement learning, analyzing their behavior and comparing models. You will work at the cutting edge of AI safety and help shape the future of the AI industry.
Must have:
  • Experience in conducting technical research collaboratively.
  • Proficiency in frameworks like PyTorch, JAX, or TensorFlow.
  • A track record of published research in machine learning, particularly in generative AI.
  • At least three years of experience addressing sophisticated ML problems.
  • Strong written and verbal communication skills.
Good to have:
  • Hands-on experience with open source LLM fine-tuning.
  • Experience in crafting evaluations or a background in data science roles related to LLM technologies.
  • Experience working with a cloud technology stack (e.g., AWS or GCP).

Job Details

As the leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of large language models (LLMs) and in safeguarding them. The Safety, Evaluations and Alignment Lab (SEAL) is Scale’s frontier research effort dedicated to tackling challenging research problems in the evaluation, red teaming, and alignment of advanced AI systems.

We are actively seeking talented researchers to join us in shaping the landscape of safety and transparency for the entire AI industry. We support collaborations across industry and academia, as well as the publication of our research findings.

As a Research Scientist working on Scalable Oversight, you will develop and evaluate methods for evaluating and supervising advanced AI systems. For example, you might do any or all of the following:

  • Design experiments to exemplify failure modes of current supervision protocols for language models 
  • Design experiments to simulate expertise and capability gaps between supervisor and model for scalable oversight experiments
  • Develop new supervision protocols and gather human annotations using these protocols 
  • Train language models using reinforcement learning, analyzing their behavior and comparing models

Ideally you’d have:

  • Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance.
  • Practical experience conducting technical research collaboratively, with proficiency in frameworks like PyTorch, JAX, or TensorFlow. You should also be adept at interpreting research literature and quickly turning new ideas into prototypes.
  • A track record of published research in machine learning, particularly in generative AI.
  • At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development.
  • Strong written and verbal communication skills to operate in a cross-functional team.

Nice to have:

  • Hands-on experience with open-source LLM fine-tuning or involvement in bespoke LLM fine-tuning projects using PyTorch/JAX.
  • Experience in crafting evaluations or a background in data science roles related to LLM technologies.
  • Experience working with a cloud technology stack (e.g., AWS or GCP) and developing machine learning models in a cloud environment.

Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions.
