SEAL Research Scientist, Frontier Risk Evaluations

3 Months ago • 3 Years + • Research Development • $176,000 PA - $300,000 PA

Job Summary

Job Description

As a Research Scientist focused on Frontier Risk Evaluations, you will design and create evaluation measures, harnesses, and datasets for measuring the risks posed by frontier AI systems. This role involves designing and building harnesses to test AI agents for dangerous capabilities, developing and running human-in-the-loop tests of AI capabilities, and collaborating with government agencies or other labs to scope and design evaluations that measure and mitigate risks. The ideal candidate has a commitment to safe AI deployments, practical experience in technical research with frameworks like PyTorch, JAX, or TensorFlow, a track record of published research in machine learning, and strong communication skills. The role requires at least three years of experience addressing sophisticated ML problems.
Must have:
  • Commitment to safe, secure, and trustworthy AI deployments.
  • Practical experience in technical research with frameworks like PyTorch, JAX, or TensorFlow.
  • Published research in machine learning, particularly in generative AI.
  • At least three years of experience addressing sophisticated ML problems.
  • Strong written and verbal communication skills.
Good to have:
  • Hands-on experience with open-source LLM fine-tuning.
  • Experience in crafting evaluations or a background in data science related to LLM technologies.
  • Experience working with a cloud technology stack (e.g., AWS or GCP).

Job Details

As the leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of large language models (LLMs) and safeguarding them. The Safety, Evaluations and Alignment Lab (SEAL) is Scale’s frontier research effort dedicated to tackling challenging research problems in the evaluation, red teaming, and alignment of advanced AI systems.

We are actively seeking talented researchers to join us in shaping the landscape of safety and transparency across the AI industry. We support collaboration across industry and academia, as well as the publication of our research findings.

As a Research Scientist focused on Frontier Risk Evaluations, you will design and create evaluation measures, harnesses and datasets for measuring the risks posed by frontier AI systems. For example, you might do any or all of the following: 

  • Design and build harnesses to test AI agents for dangerous capabilities such as hacking or exploiting security vulnerabilities;
  • Develop and run human-in-the-loop tests of AI capabilities to deceive, manipulate, blackmail, or otherwise engage in social engineering;
  • Work with government agencies or other labs to collectively scope and design evaluations to measure and mitigate risks posed by advanced AI systems.

Ideally you’d have:

  • Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance.
  • Practical experience conducting technical research collaboratively, with proficiency in frameworks like PyTorch, JAX, or TensorFlow. You should also be adept at interpreting research literature and quickly turning new ideas into prototypes.
  • A track record of published research in machine learning, particularly in generative AI.
  • At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development.
  • Strong written and verbal communication skills to operate in a cross-functional team.

Nice to have:

  • Hands-on experience with open-source LLM fine-tuning or involvement in bespoke LLM fine-tuning projects using PyTorch/JAX.
  • Experience in crafting evaluations or a background in data science roles related to LLM technologies.
  • Experience working with a cloud technology stack (e.g., AWS or GCP) and developing machine learning models in a cloud environment.

Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions.
