SEAL Research Scientist, Frontier Risk Evaluations

2 Months ago • 3 Years + • Research Development • $176,000 PA - $300,000 PA

Job Summary

Job Description

As a Research Scientist focused on Frontier Risk Evaluations, you will design and create evaluation measures, harnesses and datasets for measuring the risks posed by frontier AI systems. This role involves designing and building harnesses to test AI agents for dangerous capabilities, developing and running human-in-the-loop tests of AI capabilities, and collaborating with government agencies or other labs to scope and design evaluations to mitigate risks. The ideal candidate has a commitment to safe AI deployments, practical experience in technical research with frameworks like PyTorch, JAX, or TensorFlow, a track record of published research in machine learning, and strong communication skills. The role requires at least three years of experience addressing sophisticated ML problems.
Must have:
  • Commitment to safe, secure, and trustworthy AI deployments.
  • Practical experience in technical research with frameworks like PyTorch, JAX, or TensorFlow.
  • Published research in machine learning, particularly in generative AI.
  • At least three years of experience addressing sophisticated ML problems.
  • Strong written and verbal communication skills.
Good to have:
  • Hands-on experience with open-source LLM fine-tuning.
  • Experience in crafting evaluations or a background in data science related to LLM technologies.
  • Experience working with a cloud technology stack (e.g., AWS or GCP).

Job Details

As the leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of large language models (LLMs) and safeguarding them. The Safety, Evaluations and Alignment Lab (SEAL) is Scale’s frontier research effort dedicated to tackling challenging research problems in the evaluation, red teaming, and alignment of advanced AI systems.

We are actively seeking talented researchers to join us in shaping the landscape of safety and transparency across the entire AI industry. We support collaborations across industry and academia and the publication of our research findings.

As a Research Scientist focused on Frontier Risk Evaluations, you will design and create evaluation measures, harnesses and datasets for measuring the risks posed by frontier AI systems. For example, you might do any or all of the following: 

  • Design and build harnesses to test AI agents for dangerous capabilities such as hacking or exploiting security vulnerabilities;
  • Develop and run human-in-the-loop tests of AI capabilities to deceive, manipulate, blackmail, or otherwise engage in social engineering;
  • Work with government agencies or other labs to collectively scope and design evaluations to measure and mitigate risks posed by advanced AI systems.

Ideally you’d have:

  • Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance.
  • Practical experience conducting technical research collaboratively, with proficiency in frameworks like PyTorch, JAX, or TensorFlow. You should also be adept at interpreting research literature and quickly turning new ideas into prototypes.
  • A track record of published research in machine learning, particularly in generative AI.
  • At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development.
  • Strong written and verbal communication skills to operate in a cross-functional team.

Nice to have:

  • Hands-on experience with open-source LLM fine-tuning or involvement in bespoke LLM fine-tuning projects using PyTorch/JAX.
  • Experience in crafting evaluations or a background in data science roles related to LLM technologies.
  • Experience working with a cloud technology stack (e.g., AWS or GCP) and developing machine learning models in a cloud environment.

Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions.
