Staff Backend Engineer, Speech AI - USA

5+ years of experience • $200,000 - $300,000 per year
Backend Development

Job Description

Inworld is seeking a Staff Backend Engineer to design, build, and scale critical backend infrastructure for real-time conversational AI applications. This role involves owning features end-to-end, translating product requirements into robust, scalable services, and developing user-facing APIs and backend systems in Python, Java/Kotlin, and Go. The engineer will focus on creating a robust, low-latency platform for demanding tasks like Text-to-Speech (TTS) and Speech-to-Text (STT), powering the next generation of AI-driven software.
Perks:
  • Bonus
  • Equity
  • Benefits
  • Relocation assistance

About Inworld

At Inworld, we believe that the benefits of AI should extend beyond business workflows to the applications and experiences that we enjoy every day. We began by pushing the frontier of lifelike, interactive characters for games and entertainment, pioneering real-time conversational AI at scale. Today, we apply that expertise to provide the multimodal models, pipelines and tools needed to build and evolve consumer-scale, real-time conversational AI applications across learning, health, social, assistants, games and media.

We’ve raised more than $125M from Lightspeed, Section 32, Kleiner Perkins, Microsoft’s M12 venture fund, Founders Fund, Meta and Stanford, among others. Our technology has powered experiences from companies such as NVIDIA, Microsoft Xbox, Niantic, Logitech Streamlabs, Wishroll, Little Umbrella and Bible Chat. We’ve also been recognized by CB Insights as one of the 100 most promising AI companies globally and have been named one of LinkedIn's Top 10 Startups in the USA.

About the role

Our intelligent runtime must seamlessly connect to foundational models to power real-time, interactive experiences. For this to be possible at scale, the infrastructure that serves these models, especially for demanding tasks like Text-to-Speech (TTS) and Speech-to-Text (STT), must be exceptionally fast, reliable, and cost-effective.

We are seeking a Staff Backend Engineer to build this critical infrastructure. You will be responsible for designing, building, and scaling the backend systems that serve our voice production models. Your work will focus on the difficult engineering problems of building a robust, low-latency platform that forms the backbone of the next generation of AI-driven software.

Responsibilities

  • Own features end-to-end, from collaborating on the initial concept with product managers to shipping and monitoring the final product.
  • Translate product requirements and user needs into robust, scalable, and maintainable backend services and APIs.
  • Design, build, and launch user-facing APIs and backend systems in Python, Java/Kotlin, and Go that deliver seamless voice experiences.
  • Partner closely with Product Managers and ML engineers to define scope, identify technical trade-offs, and drive the product roadmap forward.
  • Write high-quality, production-grade code that powers real-time audio processing, model inference, and complex data pipelines.
  • Champion engineering and product excellence, with a focus on delivering tangible value to our users quickly and iteratively.

Qualifications

  • A BA/BS degree in Computer Science or a related technical field, or equivalent practical experience.
  • 5+ years of professional experience in software development, with a proven track record of shipping high-quality, user-facing products.
  • Strong product sense and an ability to think critically about user experience and business impact.
  • Demonstrated experience in building and scaling production-grade backend APIs and distributed systems.
  • Strong proficiency in Python and professional experience with at least one of Java/Kotlin or Go.
  • Hands-on experience with containerization (Docker) and deploying services on orchestration platforms like Kubernetes.
  • A solid foundation in data structures, algorithms, and system design.

A good fit for this role may have

  • A customer-first mindset and a passion for creating intuitive and powerful developer or end-user experiences.
  • Experience building backend services for speech processing (TTS/STT) or other real-time ML applications.
  • The ability to thrive in a fast-paced, collaborative environment, with a bias for action and rapid iteration.
  • Familiarity with major cloud platforms like GCP or AWS.

We believe in the power of in-person collaboration to solve the hardest problems and foster a strong team culture. We offer relocation assistance and look forward to you joining us in our Mountain View office.

The base salary range for this full-time position is $200,000 - $300,000, plus bonus, equity, and benefits.
