Member of Technical Staff - Post-Training

23 Minutes ago • All levels
Software Development & Engineering

Job Description

Reflection's mission is to build open superintelligence accessible to all by developing open-weight models. This role involves building systems to transform powerful pre-trained models into aligned and general agents. You will drive research and engineering initiatives in post-training, from data curation to large-scale optimization, developing data generation pipelines, reward models, reinforcement learning algorithms, and inference-time scaling techniques. Collaboration across pre-training and post-training teams is key to delivering significant gains in model capability and shaping our understanding of how large models learn to reason and follow instructions.
Must Have:
  • Deep understanding of ML fundamentals and practical experience with large-scale LLM training.
  • Strong engineering skills, comfortable diving into complex ML codebases and distributed systems.
  • Experience improving model behavior through data, reward modeling, or RL techniques.
  • Evidence of owning ambitious research or engineering agendas that led to measurable model improvements.
  • Ability to thrive in a fast-paced, high-agency startup environment with a bias toward action and clarity of execution.
  • Able to work fluidly across research and infrastructure boundaries.
  • Strong communication capabilities and comfort working collaboratively.
  • Passionate about advancing the frontier of intelligence.
Perks:
  • Top-tier compensation including salary and equity.
  • Comprehensive medical, dental, vision, life, and disability insurance.
  • Fully paid parental leave for all new parents, including adoptive and surrogate journeys.
  • Financial support for family planning.
  • Paid time off when needed.
  • Relocation support.
  • Daily lunch and dinner provided.
  • Regular off-sites and team celebrations.

Our Mission

Reflection’s mission is to build open superintelligence and make it accessible to all.

We’re developing open-weight models for individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders comes from DeepMind, OpenAI, Google Brain, Meta, Character.AI, Anthropic, and beyond.

About the Role

  • Build systems that transform powerful pre-trained models into aligned and general agents.
  • Drive research and engineering initiatives that push the frontier of post-training, from data curation to large-scale optimization.
  • Develop data generation pipelines, reward models, reinforcement learning algorithms, and inference-time scaling techniques.
  • Collaborate across pre-training and post-training teams to deliver step-function gains in model capability.
  • Contribute to shaping our understanding of how large models learn to reason, follow instructions, and improve through reinforcement learning.

About You

  • Deep understanding of machine learning fundamentals and practical experience with large-scale LLM training.
  • Strong engineering skills, comfortable diving into complex ML codebases and distributed systems.
  • Experience improving model behavior through data, reward modeling, or RL techniques.
  • Evidence of owning ambitious research or engineering agendas that led to measurable model improvements.
  • Thrive in a fast-paced, high-agency startup environment; bias toward action and clarity of execution.
  • Able to work fluidly across research and infrastructure boundaries.
  • Strong communication capabilities and comfort working collaboratively.
  • Passionate about advancing the frontier of intelligence.

What We Offer

We believe that to build superintelligence that is truly open, you need to start at the foundation. Joining Reflection means building from the ground up as part of a small, talent-dense team. You will help define our future as a company and the frontier of open foundational models.

We want you to do the most impactful work of your career with the confidence that you and the people you care about most are supported.

  • Top-tier compensation: Salary and equity structured to recognize and retain the best talent globally.
  • Health & wellness: Comprehensive medical, dental, vision, life, and disability insurance.
  • Life & family: Fully paid parental leave for all new parents, including adoptive and surrogate journeys. Financial support for family planning.
  • Benefits & balance: Paid time off when you need it, relocation support, and more perks that optimize your time.
  • Opportunities to connect with teammates: lunch and dinner are provided daily. We have regular off-sites and team celebrations.
