AI Infrastructure Engineer, Model Serving Platform

Posted 1 day ago • 4+ years of experience • $175,000 to $220,000 per year

Job Summary

Job Description

As a Software Engineer on the ML Infrastructure team, you will design and build platforms for scalable, reliable, and efficient serving of LLMs and AI agents. The ideal candidate combines strong ML fundamentals with deep expertise in backend system design. You’ll work in a highly collaborative environment, bridging research and engineering to deliver seamless experiences to our customers and accelerate innovation across the company. You will build and maintain fault-tolerant, high-performance systems, collaborate with researchers and engineers, conduct architecture and design reviews, develop monitoring and observability solutions, and lead projects end-to-end.
Must have:
  • 4+ years of experience building large-scale backend systems.
  • Strong programming skills in one or more languages.
  • Deep understanding of concurrency and distributed systems.
  • Experience with containers and orchestration tools.
  • Familiarity with cloud infrastructure and infrastructure as code.
  • Proven ability to solve complex problems independently.
Good to have:
  • Experience with modern LLM serving frameworks.
  • Knowledge of ML frameworks and optimization.
  • Experience with model inference optimizations.
  • Familiarity with emerging agent frameworks.

Job Details

As a Software Engineer on the ML Infrastructure team, you will design and build platforms for scalable, reliable, and efficient serving of LLMs and AI agents. Our platform powers cutting-edge research and production systems, supporting both internal and external use cases across various environments.

The ideal candidate combines strong ML fundamentals with deep expertise in backend system design. You’ll work in a highly collaborative environment, bridging research and engineering to deliver seamless experiences to our customers and accelerate innovation across the company.

You will:

  • Build and maintain fault-tolerant, high-performance systems for serving LLMs and agent-based workloads at scale.
  • Collaborate with researchers and engineers to integrate and optimize models for production and research use cases.
  • Conduct architecture and design reviews to uphold best practices in system design and scalability.
  • Develop monitoring and observability solutions to ensure system health and performance.
  • Lead projects end-to-end, from requirements gathering to implementation, in a cross-functional environment.

Ideally you'd have:

  • 4+ years of experience building large-scale, high-performance backend systems.
  • Strong programming skills in one or more languages (e.g., Python, Go, Rust, C++).
  • Deep understanding of concurrency, memory management, networking, and distributed systems.
  • Experience with containers, virtualization, and orchestration tools (e.g., Docker, Kubernetes).
  • Familiarity with cloud infrastructure (AWS, GCP) and infrastructure as code (e.g., Terraform).
  • Proven ability to solve complex problems and work independently in fast-moving environments.

Nice to haves:

  • Experience with modern LLM serving frameworks such as vLLM, SGLang, TensorRT-LLM, or text-generation-inference.
  • Knowledge of ML frameworks (e.g., PyTorch or TensorFlow) and how to optimize them for production serving.
  • Experience with model inference optimizations such as quantization, distillation, and speculative decoding.
  • Familiarity with emerging agent frameworks such as OpenHands, Agent2Agent, or MCP.

About The Company

Scale AI
