Staff Platform Engineer, MLOps

2 Months ago • 7+ Years • DevOps • $180,000 PA - $280,000 PA

Job Summary

Job Description

As a Staff Platform Engineer (MLOps), you will design, deploy, and maintain cloud infrastructure for Inworld's AI Engine and Studio. Responsibilities include developing and optimizing the ML model lifecycle using the Inworld AI platform and NVIDIA CUDA; implementing CI/CD systems for ML workflows; monitoring models for issues; designing MLOps tools; and fostering a "you build it, you run it" culture. You will manage CI/CD pipelines, enhance engineering speed and efficiency, conduct root cause analysis, and share best practices. The role requires extensive experience with infrastructure-as-code, Kubernetes, CI/CD tooling (Terraform/Terragrunt, ArgoCD, etc.), and cloud platforms.
Must have:
  • 7+ years software engineering experience
  • 5+ years infrastructure-as-code experience
  • Kubernetes cluster management proficiency
  • CI/CD pipeline creation and maintenance
  • Cloud provider expertise (GCP, Azure, Oracle)
  • Backend programming (Golang, Python, Bash)
Good to have:
  • Open source LLM & serving solution familiarity
  • SLURM experience
  • Data pipeline & workflow management tools experience
  • Bare metal GPU experience

Job Details

Why Join Inworld

Inworld is the leading provider of AI technology for real-time interactive experiences, with a $500 million valuation and backing from top-tier investors including Intel Capital, Microsoft’s M12 fund, Lightspeed Venture Partners, Section 32, BITKRAFT Ventures, Kleiner Perkins, Founders Fund, and First Spark Ventures.

Inworld provides the market’s best framework for building production-ready interactive experiences, coupled with dedicated services to optimize specific stages of development, from design and development to ML pipeline optimization and custom compute infrastructure. We help developers bring their AI engines in-house with a framework optimized for real-time data ingestion, low latency, and massive scale. Inworld powers experiences built by Ubisoft, NVIDIA, Niantic, NetEase Games, and LG, among others, and has partnerships with key industry players such as Microsoft Xbox, Epic Games, and Unity.

Inworld was recognized by CB Insights as one of the 100 most promising AI companies in the world in 2024 and was named among LinkedIn's Top Startups of 2024 in the USA.

 

About the Role:

As a Staff Platform Engineer (MLOps), you'll work closely with backend and ML Engineering teams to design, deploy, and maintain reliable, high-performance, and secure cloud infrastructure for our AI Engine and Studio. 

 

What you'll do:

  • Develop, manage, and optimize the ML model lifecycle in production using the Inworld AI platform and NVIDIA CUDA: implement CI/CD systems for ML workflows, monitor models to identify issues and inefficiencies, and design MLOps tools and frameworks to enhance automation and efficiency.
  • Facilitate a "you build it, you run it" culture by providing the necessary tools and processes for monitoring the reliability, availability, and performance of services.
  • Manage CI/CD pipelines to ensure smooth and efficient code integration and deployment.
  • Identify and implement opportunities to enhance engineering speed and efficiency.
  • Conduct root cause analysis to identify critical issues and develop automated solutions to prevent recurrence.
  • Develop and share best practices to improve automation and efficiency across our engineering teams.

 

Expected experience:

  • 7+ years of experience in software engineering.
  • 5+ years of experience with infrastructure-as-code.
  • Proficiency in managing Kubernetes clusters and applications, including creating Helm charts/Kustomize manifests for new applications.
  • Experience in creating and maintaining CI/CD pipelines for both applications and infrastructure deployments (using tools like Terraform/Terragrunt, ArgoCD, GitHub Actions, Ansible, etc.).
  • Deep knowledge of at least one major cloud provider (Google Cloud Platform, Microsoft Azure, Oracle Cloud).
  • Proficiency in at least one backend programming or scripting language, such as Golang, Python, or Bash.
  • Familiarity with open source LLMs and open source serving solutions (e.g. vLLM, llama.cpp, KServe) is a plus.
  • Experience with SLURM is a plus.
  • Experience with data pipeline and workflow management tools is a plus.
  • Experience with bare metal GPUs is a plus.

 

In-office location: Mountain View, CA, United States. You must be available for hybrid work. 

 

The US base salary range for this full-time position is $180,000 - $280,000. In addition to base pay, total compensation includes equity and benefits. Within the range, individual pay is determined by work location, level, and additional factors, including competencies, experience, and business needs. The base pay range is subject to change and may be modified in the future.
