AI Infrastructure Engineer, Model Serving Platform

Posted 5 months ago • 4+ years experience • $175,000–$220,000 per annum
DevOps

Job Description

As a Software Engineer on the ML Infrastructure team, you will design and build platforms for scalable, reliable, and efficient serving of LLMs and AI agents. Our platform powers cutting-edge research and production systems, supporting both internal and external use cases across various environments.

The ideal candidate combines strong ML fundamentals with deep expertise in backend system design. You’ll work in a highly collaborative environment, bridging research and engineering to deliver seamless experiences to our customers and accelerate innovation across the company.

You will:

  • Build and maintain fault-tolerant, high-performance systems for serving LLMs and agent-based workloads at scale.
  • Collaborate with researchers and engineers to integrate and optimize models for production and research use cases.
  • Conduct architecture and design reviews to uphold best practices in system design and scalability.
  • Develop monitoring and observability solutions to ensure system health and performance.
  • Lead projects end-to-end, from requirements gathering to implementation, in a cross-functional environment.

Ideally you'd have:

  • 4+ years of experience building large-scale, high-performance backend systems.
  • Strong programming skills in one or more languages (e.g., Python, Go, Rust, C++).
  • Deep understanding of concurrency, memory management, networking, and distributed systems.
  • Experience with containers, virtualization, and orchestration tools (e.g., Docker, Kubernetes).
  • Familiarity with cloud infrastructure (AWS, GCP) and infrastructure as code (e.g., Terraform).
  • Proven ability to solve complex problems and work independently in fast-moving environments.

Nice to haves:

  • Experience with modern LLM serving frameworks such as vLLM, SGLang, TensorRT-LLM, or text-generation-inference.
  • Knowledge of ML frameworks (e.g., PyTorch or TensorFlow) and how to optimize them for production serving.
  • Experience with model inference optimizations such as quantization, distillation, speculative decoding, etc.
  • Familiarity with emerging agent frameworks such as OpenHands, Agent2Agent, MCP.

Company: Scale AI • Category: DevOps • Location: United States (Remote)
