ML Inference Router Engineer

10+ years • $132,000 - $222,100 per year
Research Development

Job Description

eBay’s AI Platform team is building the next generation of agentic and inference technologies. This role involves designing and building a highly scalable, low-latency ML inference gateway to support billions of daily requests, orchestrating across diverse large language models (LLMs). The engineer will develop distributed, fault-tolerant systems, optimize throughput, cost, and reliability, and integrate new models and agentic capabilities, shaping the backbone of AI at a global scale.
Good To Have:
  • Experience with inference serving frameworks (vLLM, Triton, TensorRT-LLM, FasterTransformer, DeepSpeed-MII).
  • Familiarity with LLM tokenization, batching, and scheduling strategies.
  • Background in microservice API gateway design (rate limiting, routing policies, authentication).
  • Experience with real-time monitoring, tracing, and autoscaling of high-throughput systems.
  • Contributions to open-source distributed systems or ML serving projects.
Must Have:
  • Design and build a scalable LLM inference gateway.
  • Develop intelligent request routing and load balancing for LLM backends.
  • Optimize throughput, cost, and reliability of inference workloads.
  • Collaborate with teams to integrate new models and agentic capabilities.
  • Implement observability, tracing, and autoscaling for inference traffic.
  • Conduct design and code reviews for distributed systems architecture.
  • Stay current with LLM serving, inference acceleration, and model APIs.
  • 10+ years experience in large-scale, fault-tolerant distributed systems.
  • Strong programming skills in Java, Go, Rust, or C++.
  • Deep understanding of networking, concurrency, memory management, performance tuning.
  • Proven experience designing and operating low-latency APIs at scale.
  • Hands-on experience with Kubernetes, service meshes, container orchestration.
  • Strong background in cloud infrastructure (AWS, GCP, Azure) and distributed system design.
Perks:
  • Medical benefits
  • Financial benefits
  • 401(k) eligibility
  • Paid time off (PTO)
  • Parental leave

At eBay, we're more than a global ecommerce leader — we’re changing the way the world shops and sells. Our platform empowers millions of buyers and sellers in more than 190 markets around the world. We’re committed to pushing boundaries and leaving our mark as we reinvent the future of ecommerce for enthusiasts.

Our customers are our compass, authenticity thrives, bold ideas are welcome, and everyone can bring their unique selves to work — every day. We're in this together, sustaining the future of our customers, our company, and our planet.

Join a team of passionate thinkers, innovators, and dreamers — and help us connect people and build communities to create economic opportunity for all.

About the team and the role:

eBay’s AI Platform team is building the next generation of agentic and inference technologies that power AI experiences for hundreds of millions of users worldwide. We are seeking an ML Inference Router Engineer to design and build a highly scalable, low-latency inference gateway capable of supporting billions of daily requests.

This role sits at the core of eBay’s AI infrastructure—developing distributed, fault-tolerant systems that orchestrate requests across diverse large language models (LLMs) and ensure high reliability, efficiency, and cost-effectiveness. If you are passionate about large-scale systems engineering, love solving hard performance problems, and want to shape the backbone of AI at global scale, we’d love to hear from you.

What you will accomplish:

  • Design and build an LLM inference gateway that scales to billions of daily requests with millisecond-level latency.
  • Develop intelligent request routing, load balancing, and fallback mechanisms across heterogeneous LLM backends (internal and external).
  • Optimize throughput, cost, and reliability of inference workloads in multi-tenant environments.
  • Collaborate with platform, research, and product teams to integrate new models and agentic capabilities into the gateway.
  • Implement observability, tracing, and autoscaling for inference traffic across Kubernetes-based clusters.
  • Conduct design and code reviews to ensure high standards in distributed systems architecture.
  • Stay current with advances in LLM serving, inference acceleration, and model APIs to continuously evolve the platform.

What you will bring:

  • 10+ years of experience building large-scale, fault-tolerant, high-performance distributed systems.
  • Strong programming skills in one or more of Java, Go, Rust, or C++ (Java preferred for gateway services).
  • Deep understanding of networking, concurrency, memory management, and performance tuning in production systems.
  • Proven experience designing and operating low-latency APIs at very large scale (10M+ QPS).
  • Hands-on experience with Kubernetes, service meshes, and container orchestration at scale.
  • Strong background in cloud infrastructure (AWS, GCP, Azure) and distributed system design.

Bonus Skills:

  • Experience with inference serving frameworks (vLLM, Triton, TensorRT-LLM, FasterTransformer, DeepSpeed-MII, or similar).
  • Familiarity with LLM tokenization, batching, and scheduling strategies.
  • Background in microservice API gateway design (rate limiting, routing policies, authentication).
  • Experience with real-time monitoring, tracing, and autoscaling of high-throughput systems.
  • Contributions to open-source distributed systems or ML serving projects.

#LI-Hybrid

The base pay range for this position is expected in the range below:

$132,000 - $222,100

Base pay offered may vary depending on multiple individualized factors, including location, skills, and experience. The total compensation package for this position may also include other elements, including a target bonus and restricted stock units (as applicable) in addition to a full range of medical, financial, and/or other benefits (including 401(k) eligibility and various paid time off benefits, such as PTO and parental leave). Details of participation in these benefit plans will be provided if an employee receives an offer of employment.

If hired, employees will be in an “at-will position” and the Company reserves the right to modify base salary (as well as any other discretionary payment or compensation program) at any time, including for reasons related to individual performance, Company or individual department/team performance, and market factors.

Please see the Talent Privacy Notice for information regarding how eBay handles your personal data collected when you use the eBay Careers website or apply for a job with eBay.

eBay is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, sex, sexual orientation, gender identity, veteran status, and disability, or other legally protected status. If you have a need that requires accommodation, please contact us at talent@ebay.com. We will make every effort to respond to your request for accommodation as soon as possible. View our accessibility statement to learn more about eBay's commitment to ensuring digital accessibility for people with disabilities. It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.

About Us

We Empower People and Create Economic Opportunity

eBay Inc. (NASDAQ: EBAY) is a global commerce leader that connects millions of buyers and sellers around the world. We exist to enable economic opportunity for individuals, entrepreneurs, businesses and organizations of all sizes.

