Research Scientist, Reinforcement Learning

Posted 1 month ago • 5+ years of experience • Research Development • $250,000 - $290,000 per year

Job Summary

Job Description

Fireworks is building the future of generative AI infrastructure, offering the highest-quality models and the fastest, most scalable inference. As a Research Scientist focused on Reinforcement Learning (RL), you will push the boundaries of how large language models are trained, aligned, and deployed. This role involves designing algorithms, building training pipelines, and running experiments, focusing on scalable RLHF alternatives like GRPO and DPO, reward modeling, and agent-based training. Your contributions will directly impact model quality, training workflows, and customer-facing APIs, requiring collaboration with researchers, engineers, and product teams to translate cutting-edge RL into practical systems for LLM deployment.
Must have:
  • 5+ years of research experience in reinforcement learning
  • Strong understanding of RL fundamentals
  • Experience with reinforcement fine-tuning of LLMs
  • Experience building and training deep learning models using PyTorch
  • Proficiency in Python and clean code writing
  • Ability to lead RL experiments from idea to analysis
  • Excellent communication and collaboration skills
Good to have:
  • PhD in Computer Science, Machine Learning, or related field
  • Publications at top-tier ML conferences
  • Experience building interactive agents
  • Expertise in reward modeling and LLM evaluation
Perks:
  • Meaningful equity in a fast-growing startup
  • Competitive salary
  • Comprehensive benefits package
  • Opportunity to solve hard problems at the forefront of AI infrastructure
  • Work with bleeding-edge technology
  • Ownership and direct impact
  • Learn from world-class engineers and AI researchers

Job Details

About Us:

Here at Fireworks, we’re building the future of generative AI infrastructure. Fireworks offers the generative AI platform with the highest-quality models and the fastest, most scalable inference. We’ve been independently benchmarked to have the fastest LLM inference and have been getting great traction with innovative research projects, like our own function calling and multi-modal models. Fireworks is funded by top investors, like Benchmark and Sequoia, and we’re an ambitious, fun team composed primarily of veterans from PyTorch and Google Vertex AI.

The Role:

As a Research Scientist focused on Reinforcement Learning (RL), you’ll apply your deep expertise in the field to push the boundaries of how large language models are trained, aligned, and deployed. We’re looking for someone with a strong foundation in RL - not just familiarity, but hands-on experience designing algorithms, building training pipelines, and running experiments.

You’ll work on everything from scalable RLHF alternatives (e.g., GRPO, DPO) to reward modeling and agent-based training. Your contributions will directly impact Fireworks’ model quality, training workflows, and customer-facing APIs. You’ll also collaborate with researchers, engineers, and product teams to translate state-of-the-art RL into practical systems used by companies deploying LLMs at scale.

Key Responsibilities:

  • Design, implement, and optimize reinforcement learning algorithms to improve the training and alignment of large language models.
  • Develop scalable pipelines for reinforcement learning from human feedback (RLHF) and explore alternatives such as GRPO and DPO.
  • Conduct hands-on experiments across reward modeling, agent-based training, and reinforcement fine-tuning of LLMs.
  • Collaborate with cross-functional teams, including researchers, engineers, and product managers, to integrate cutting-edge RL advancements into production systems.
  • Analyze experimental results and iterate quickly to improve model performance and training workflows.
  • Contribute to the development of Fireworks’ customer-facing APIs by enhancing model alignment and real-world usability.
  • Stay current with the latest research in reinforcement learning, LLM alignment, and AI safety to inform and inspire new initiatives.

Minimum Qualifications:

  • 5+ years of research experience specifically in reinforcement learning
  • Strong understanding of RL fundamentals, including policy gradients, actor-critic methods, offline RL, and preference-based learning
  • Experience with reinforcement fine-tuning of LLMs (e.g., PPO, DPO, GRPO)
  • Experience building and training deep learning models using PyTorch
  • Proficiency in Python and ability to write clean, efficient, research-grade code
  • Demonstrated ability to lead RL experiments from idea to implementation and analysis
  • Excellent communication skills and the ability to collaborate in fast-paced, cross-functional environments

Preferred Qualifications:

  • PhD in Computer Science, Machine Learning, Applied Mathematics, or a related field
  • Publications at top-tier ML conferences (NeurIPS, ICML, ICLR, etc.)
  • Experience building interactive agents that leverage tools, APIs, or search
  • Expertise in reward modeling and LLM evaluation strategies

Total compensation for this role also includes meaningful equity in a fast-growing startup, along with a competitive salary and comprehensive benefits package. Base salary is determined by a range of factors including individual qualifications, experience, skills, interview performance, market data, and work location. The listed salary range is intended as a guideline and may be adjusted.

Base Pay Range (Plus Equity)

$250,000 - $290,000 USD

Why Fireworks AI?

  • Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.
  • Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.
  • Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.
  • Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.

Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.
