Research Engineer - Post Training

Turing

Palo Alto, California, United States (On Site) | Full Time | 1 day ago

Job Summary

This role sits at the intersection of data scaling, post-training, and reinforcement learning. You’ll work on core algorithms and pipelines that transform pre-trained models into steerable, versatile systems capable of solving real-world problems. From designing RL environments to crafting data mixtures and reward models, your work will power both cutting-edge research and production-grade AI. This is a rare opportunity to work at the frontier of model training—owning the full loop from data generation and reward modeling to reinforcement learning and evaluation. Your work will directly accelerate our mission to develop AI systems that reason, code, and generate new knowledge.

Must Have

Design and implement large-scale data pipelines for RL, SFT, and post-training workflows
Develop and iterate on simulated or interactive environments to train, evaluate, and stress-test agents
Advance reinforcement learning algorithms and generalizable reward models
Define and improve data quality and evaluation metrics
Implement scalable model evaluation frameworks
Strong software engineering and systems-building skills
Deep understanding of machine learning and fine-tuning of large language models (LLMs)
Hands-on experience improving model behavior through data-driven methods and reinforcement learning (SFT, PPO, DPO, or similar)
Familiarity with large-scale data generation and evaluation pipelines
Experience developing benchmarks and evaluation metrics for reasoning, coding, or multi-agent systems
Proven ability to design or optimize models for complex challenges

Good to Have

Experience building or scaling infrastructure for large-scale RL or post-training

Perks & Benefits

Amazing work culture (super collaborative and supportive work environment; 5 days a week)
Awesome colleagues (surround yourself with top talent from Meta, Google, LinkedIn, etc., as well as people with deep startup experience)
Competitive compensation
Flexible working hours

Job Description

About Turing

Based in San Francisco, California, Turing is the world’s leading research accelerator for frontier AI labs and a trusted partner for global enterprises looking to deploy advanced AI systems. Turing accelerates frontier research with high-quality data, specialized talent, and training pipelines that advance thinking, reasoning, coding, multimodality, and STEM. For enterprises, Turing builds proprietary intelligence systems that integrate AI into mission-critical workflows, unlock transformative outcomes, and drive lasting competitive advantage.

Recognized by Forbes, The Information, and Fast Company among the world’s top innovators, Turing’s leadership team includes AI technologists from Meta, Google, Microsoft, Apple, Amazon, McKinsey, Bain, Stanford, Caltech, and MIT. Learn more at www.turing.com

About the Role

Data is the lifeblood of advanced AI. Our R&D Research Engineers build the foundational systems that generate, refine, and evaluate high-quality data at unprecedented scale—directly fueling the next generation of reasoning and coding agents.

This role sits at the intersection of data scaling, post-training, and reinforcement learning. You’ll work on core algorithms and pipelines that transform pre-trained models into steerable, versatile systems capable of solving real-world problems. From designing RL environments to crafting data mixtures and reward models, your work will power both cutting-edge research and production-grade AI.

What You’ll Work On

Design and implement large-scale data pipelines for RL, SFT, and post-training workflows to create and refine high-quality datasets.
Develop and iterate on simulated or interactive environments to train, evaluate, and stress-test reasoning and coding agents.
Advance reinforcement learning algorithms and generalizable reward models to improve model reasoning, coding, and decision-making.
Define and improve data quality and evaluation metrics to ensure models are learning from the best possible signals.
Implement scalable model evaluation frameworks to measure progress in reasoning, code generation, and agentic capabilities.
Collaborate cross-functionally with research, data, multimodal, and product teams to bring cutting-edge research into real-world impact.

What We’re Looking For

Strong software engineering and systems-building skills.
Deep understanding of machine learning and fine-tuning of large language models (LLMs).
Hands-on experience improving model behavior through data-driven methods and reinforcement learning (SFT, PPO, DPO, or similar).
Familiarity with large-scale data generation and evaluation pipelines.
Experience developing benchmarks and evaluation metrics for reasoning, coding, or multi-agent systems.
Proven ability to design or optimize models for complex challenges such as multi-modality, long-context reasoning, or multi-agent orchestration.
(Bonus) Experience building or scaling infrastructure for large-scale RL or post-training.

Why This Role Matters

This is a rare opportunity to work at the frontier of model training—owning the full loop from data generation and reward modeling to reinforcement learning and evaluation. Your work will directly accelerate our mission to develop AI systems that reason, code, and generate new knowledge.

Salary Range: $170K – $220K + Equity + Bonus

Values:

We are client first: We put our clients at the center of everything we do, because their success is the ultimate measure of our value.
We work at Start-Up Speed: We move fast, stay agile, and favor action because momentum is the foundation of perfection.
We are AI forward: We help our clients build the future of AI and implement it in our own roles and workflows to amplify productivity.

Don’t meet every single requirement? Studies have shown that women and people of color are less likely to apply to jobs unless they meet every single qualification. Turing is proud to be an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender identity, sexual orientation, age, marital status, disability, protected veteran status, or any other legally protected characteristic. At Turing we are dedicated to building a diverse, inclusive, and authentic workplace, so if you’re excited about this role but your past experience doesn’t align perfectly with every qualification in the job description, we encourage you to apply anyway. You may be just the right candidate for this or other roles.

For applicants from the European Union, please review Turing's GDPR notice here.

