SEAL Research Scientist, Agent Robustness

3 Months ago • 3 Years + • Research Development • $176,000 PA - $300,000 PA

Job Summary

Job Description

As a Research Scientist working on Agent Robustness, you will tackle the core challenges of building safe and human-aligned AI agents. Your responsibilities include researching AI agent capabilities, designing benchmarks, and creating harnesses to test for harmful actions. You'll also design exploits and mitigations for emerging failure modes and characterize risks in multi-agent systems. This role involves collaborative research, interpreting literature, and prototyping new ideas in frameworks like PyTorch, JAX, or TensorFlow. The work contributes to the safety and transparency of AI.
Must have:
  • Commitment to safe and trustworthy AI deployments.
  • Practical experience in collaborative technical research.
  • Proficiency in frameworks like PyTorch, JAX, or TensorFlow.
  • Track record of published research in machine learning.
  • At least three years of experience in ML problem-solving.
  • Strong written and verbal communication skills.
Good to have:
  • Hands-on experience with open source LLM fine-tuning.
  • Experience in crafting evaluations or data science roles.
  • Experience working with a cloud technology stack (AWS or GCP).

Job Details

As the leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of large language models (LLMs) and in safeguarding them. The Safety, Evaluations and Alignment Lab (SEAL) is Scale’s frontier research effort dedicated to tackling challenging research problems in the evaluation, red teaming, and alignment of advanced AI systems.

We are actively seeking talented researchers to join us in shaping the landscape of safety and transparency for the entire AI industry. We support collaborations across industry and academia, as well as the publication of our research findings.

As a Research Scientist working on Agent Robustness, you will work on the fundamental challenges of building AI agents that are safe and aligned with humans. For example, you might:

  • Research the science of AI agent capabilities and methodologies for benchmarking them;
  • Design and build harnesses to test AI agents’ tendency to take harmful actions when pressured to do so by users or tricked into doing so by elements of their environment;
  • Design and build exploits and mitigations for new and unique failure modes that arise as AI agents gain affordances like coding, web browsing, and computer use;
  • Characterize and design mitigations for potential failure modes or broader risks of systems involving multiple interacting AI agents.

Ideally you’d have:

  • Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance.
  • Practical experience conducting technical research collaboratively, with proficiency in frameworks like PyTorch, JAX, or TensorFlow. You should also be adept at interpreting research literature and quickly turning new ideas into prototypes.
  • A track record of published research in machine learning, particularly in generative AI.
  • At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development.
  • Strong written and verbal communication skills to operate in a cross-functional team.

Nice to have:

  • Hands-on experience with open-source LLM fine-tuning or involvement in bespoke LLM fine-tuning projects using PyTorch/JAX.
  • Experience in crafting evaluations or a background in data science roles related to LLM technologies.
  • Experience working with a cloud technology stack (e.g., AWS or GCP) and developing machine learning models in a cloud environment.

Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions.
