Member of Technical Staff - RL Infrastructure [data, evals, agent]

All levels • $180,000 - $440,000 per year

Research Development

Job Description

xAI is seeking experienced software engineers to create robust data pipelines, comprehensive evaluations for benchmarking LLMs, and automation frameworks that increase the productivity of researchers and engineers. The role focuses on creating and maintaining frameworks for agent, data, and model evaluation tasks, building environments for AI agents, and building tools for automating common workflows. It also involves improving alerts, metrics, and error handling on large-scale RL jobs; refactoring existing frameworks for modularity; designing operating procedures and coding standards; and writing unit tests and CI/CD frameworks.

About the company

The company’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All engineers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

About the Role

The company is seeking experienced software engineers to create robust data pipelines, comprehensive evaluations for benchmarking LLMs, and automation frameworks to increase the productivity of researchers and engineers.

Focus

Creating and maintaining frameworks for agent, data, and model evaluation tasks.
Building environments for AI agents.
Building tools for automating common workflows.
Improving alerts, metrics, and error handling on large-scale RL jobs.
Refactoring existing agent, data, eval, and training frameworks for better modularity.
Designing operating procedures and coding standards to streamline the transition from small-scale experimentation to large-scale RL training.
Writing unit tests and CI/CD frameworks to support rapid development cycles.

Ideal Experience

Experience building and maintaining frameworks that are used by many engineers.
Experience in building high-performance sandboxes, virtual machines, and simulations.
Experience building full-stack apps for automating workflows and data visualization.
Experience in rapid iteration of research-to-production cycles.
Experience in test automation and CI/CD.

Typical problems you will deal with

1. We have a new agentic model capability that we’d like to improve. How do we design an efficient and robust environment for the agent to perform actions in?

2. Evaluations and observability are a core part of knowing what we need to improve in our models. What new features can we add into our evaluation framework to ease the workflow of researchers & engineers and increase observability?

3. A new open-source evaluation dataset has been released and researchers would like to track our models' performance on it. How should we onboard it into our internal evaluation framework?

4. Datasets have been collected that require complex preprocessing to prepare them for large-scale RL training. How do we standardize our preprocessing pipelines to minimize dataset onboarding time?

5. A researcher on the team has an idea for how to augment a dataset to produce additional training data. How should we go about creating the data augmentation pipeline?

Tech Stack

Python / Rust / C++
TypeScript / React

Interview Process

After you submit your application, the team reviews your CV and statement of exceptional work. If your application passes this stage, you will be invited to a 15-minute interview (“phone interview”) during which a member of our team will ask some basic questions. If you clear the initial phone interview, you will enter the main process, which consists of four technical interviews:

1. Coding assessment in a language of your choice.

2. Two hands-on systems interviews: Demonstrate practical skills in live problem-solving sessions that involve both system design and coding.

3. Meet the Team: Present your past exceptional work and your vision for the company to a small audience.

Our goal is to finish the main process within one week. All interviews will be conducted via Google Meet.

Annual Salary Range

$180,000 - $440,000 USD

Benefits

Base salary is just one part of our total rewards package at the company, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.

The company is an equal opportunity employer.
