Senior / Principal Inference Engineer - ML Platform

1 Month ago • 4 Years + • Devops • $273,070 PA - $322,170 PA

Job Summary

Job Description

Roblox is seeking a Senior / Principal Inference Engineer to build the next generation of ML Ecosystem Tooling, focusing on model inference. The role involves setting technical strategy, overseeing development of high-scale, reliable infrastructure for large-scale inference, and optimizing performance across the inference stack. The engineer will stay updated on industry trends, bootstrap and maintain infrastructure components like the Serving Layer, Metadata Store, Model Registry, and Pipeline Orchestrator, and partner with other organizations to enhance the ML@Roblox platform. This position requires a strong background in building complex distributed systems for real-time ML inference serving.
Must have:
  • 4+ years of professional experience
  • System design experience
  • Build scalable, reliable platforms
  • Complex distributed systems
  • Real-time ML inference serving
  • Low latency, high throughput inference
  • Bachelor's degree in CS or related field
  • Support internal partners
  • Fix weaknesses in systems
Good to have:
  • Experience with recommendation systems
  • Familiar with Triton Inference Server
  • Familiar with TensorRT
  • Familiar with KServe

Job Details

Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends in 3D immersive digital experiences, all created by our global community of developers and creators.

At Roblox, we’re building the tools and platform that empower our community to bring any experience that they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device. We’re on a mission to connect a billion people with optimism and civility, and we’re looking for amazing talent to help us get there.

A career at Roblox means you’ll be working to shape the future of human interaction, solving unique technical challenges at scale, and helping to create safer, more civil shared experiences for everyone.

As a Senior / Principal Inference Engineer on ML Platform, you will build the next generation of ML Ecosystem Tooling, specifically around model inference. ML Platform today supports billions of requests per day across our homepage, marketplace, economy, and more. We are looking for accomplished engineers to help build out the next generation of ML platform tooling for high-scale inference in a rapidly evolving space.

You Will:

  • Set technical strategy and oversee development of high-scale, reliable infrastructure systems for large-scale inference, especially as we scale up both inference QPS and model size.
  • Dig into performance bottlenecks all along the inference stack, spanning from model optimizations to infrastructure optimizations.
  • Stay abreast of industry trends in machine learning and infrastructure to ensure the adoption of leading-edge technologies and practices.
  • Bootstrap and maintain infrastructure for ML Platform components—Serving Layer, Metadata Store, Model Registry, and Pipeline Orchestrator.
  • Partner across organizations to build tooling, interfaces, and visualizations that make the ML@Roblox platform a delight to use.

You Have:

  • 4+ years of professional experience and a tool chest of system design experience upon which to draw to build scalable, reliable platforms for all of Roblox.
  • Experience building complex distributed systems that scale to real-time ML inference serving, ideally for real-time recommendation systems serving millions of QPS.
  • Experience debugging complicated infrastructure-level performance issues to enable low-latency, high-throughput inference.
  • Bachelor's degree or higher in Computer Science, Computer Engineering, Data Science, or a similar technical field.

You Are:

  • Passionate about supporting and working cross-functionally with internal partners (Data Scientists and ML Engineers) to understand and meet their needs.
  • A reliability nut: you love digging into tricky postmortems and identifying and fixing weaknesses in complicated systems.
  • Ideally familiar with ML model inference frameworks like Triton Inference Server, TensorRT, KServe.

For roles that are based at our headquarters in San Mateo, CA: The starting base pay for this position is as shown below. The actual base pay is dependent upon a variety of job-related factors such as professional background, training, work experience, location, business needs and market demand. Therefore, in some circumstances, the actual salary could fall outside of this expected range. This pay range is subject to change and may be modified in the future. All full-time employees are also eligible for equity compensation and for benefits as described on this page.

Annual Salary Range
$273,070 - $322,170 USD

Roles that are based in our San Mateo, CA Headquarters are in-office Tuesday, Wednesday, and Thursday, with optional in-office on Monday and Friday (unless otherwise noted).

Roblox provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. Roblox also provides reasonable accommodations for all candidates during the interview process.
