Staff Software Engineer, Machine Learning Infrastructure

5 Months ago • 8 Years + • Artificial Intelligence

Job Summary

Job Description

As a Staff Software Engineer on the Machine Learning Infrastructure team at Thumbtack, you will contribute to the design, implementation, and maintenance of scalable ML systems. Responsibilities include defining and driving the technical vision for Thumbtack's next-generation ML infrastructure, leading cross-functional initiatives, architecting critical ML infrastructure components (model serving and RAG systems), establishing technical standards and best practices, mentoring engineering teams, and partnering with senior leadership to align ML capabilities with business objectives. The role involves working with technologies like Go, Python, and modern ML frameworks (PyTorch, TensorFlow).
Must have:
  • 8+ years of engineering experience in distributed systems
  • 4+ years building ML infrastructure at scale
  • Expertise in Go or Python
  • Strong architectural skills
  • Experience mentoring teams
  • Deep understanding of ML workflows
Good to have:
  • Experience with hundreds of production models
  • Expertise with PyTorch/TensorFlow and MLOps tools
  • Generative AI implementation experience
  • High-performing team building experience
  • Cloud-native architectures expertise (AWS, GCP)
  • Experience in fast-growing tech companies
Perks:
  • Virtual-first working model
  • 20 company holidays
  • WiFi reimbursement
  • Cell phone reimbursement
  • Employee Assistance Program

Job Details

A home is the biggest investment most people make, and yet, it doesn’t come with a manual. That's why we’re building the only app homeowners need to effortlessly manage their homes: knowing what to do, when to do it, and who to hire. With Thumbtack, millions of people care for what matters most, and pros earn billions of dollars through our platform. And as one of the fastest-growing companies in a $600B+ industry, we must be doing something right.

We are driven by a common goal and the deep satisfaction that comes from knowing our work supports local economies, helps small businesses grow, and brings homeowners peace of mind. We’re seeking people who continually put our purpose first: advocating for pros and customers, embracing change, and choosing teamwork every day.

At Thumbtack, we're creating a new era of home care. If making an impact and the chance to do good inspires you, join us. Imagine what we’ll build together. 

Thumbtack by the Numbers

  • Available nationwide in every U.S. county
  • Over 85 million projects started on Thumbtack
  • More than 11 million 5-star reviews and counting
  • Pros earn billions on our platform
  • 1000+ employees 
  • $3.2 billion valuation (June 2021)

About the Machine Learning Infrastructure Team

At Thumbtack, we're solving complex technical challenges across search, ranking, recommendations, pricing optimization, and spam detection. Our ML Infrastructure team leads the architectural vision and implementation of enterprise-wide machine learning capabilities, enabling teams to effectively experiment with and deploy ML models at scale. We're building next-generation infrastructure that powers Thumbtack's AI-first future. For insights into our engineering challenges, visit our engineering blog.

Challenge 

As a Staff Software Engineer on the ML Infrastructure team, you'll drive the technical vision and strategic direction of Thumbtack's machine learning platform. You'll architect solutions that democratize ML capabilities across the organization while establishing best practices and technical standards. Working closely with senior leadership, you'll shape our technical roadmap for generative AI adoption, feature platform evolution, and ML operational excellence.

Responsibilities

  • Define and drive the technical vision and architecture for Thumbtack's next-generation ML infrastructure
  • Lead cross-functional initiatives spanning engineering, data science, and product teams to build scalable, enterprise-grade ML systems
  • Architect and oversee implementation of critical ML infrastructure components, including scalable model serving and RAG systems
  • Establish technical standards and best practices for ML engineering across the organization
  • Mentor and provide technical leadership to engineering teams on ML infrastructure best practices
  • Partner with senior leadership to align ML infrastructure capabilities with business objectives

What you’ll need

If you don't think you meet all of the criteria below but are still interested in the job, please apply. Nobody checks every box, and we're looking for someone excited to join the team.

  • 8+ years of engineering experience with significant focus on distributed systems
  • 4+ years of hands-on experience building ML infrastructure or ML platforms at scale
  • Deep expertise in at least one major programming language; proficiency in our core stack (Go, Python) preferred
  • Proven track record of technical leadership on complex, cross-functional projects
  • Strong architectural skills with experience designing scalable, reliable distributed systems
  • Deep understanding of ML workflows, common frameworks, and operational challenges
  • Experience mentoring teams and driving engineering excellence
  • Track record of making strategic technical decisions with organization-wide impact

Bonus points if you have

  • Experience building AI platforms that support hundreds of models in production
  • Deep expertise with modern ML frameworks (PyTorch, TensorFlow) and MLOps tools
  • Experience implementing generative AI capabilities at enterprise scale
  • Track record of building high-performing technical teams
  • Expertise with cloud-native architectures and major cloud providers (AWS, GCP)
  • Experience driving technical strategy at fast-growing technology companies

Thumbtack is a virtual-first company, meaning you can live and work from any one of our approved locations across the United States, Canada or the Philippines.* Learn more about our virtual-first working model here.

Benefits & Perks
  • Virtual-first working model coupled with in-person events
  • 20 company-wide holidays including a week-long end-of-year company shutdown
  • Library (an optional-use collaboration and connection hub) in San Francisco
  • WiFi reimbursements 
  • Cell phone reimbursements (North America) 
  • Employee Assistance Program for mental health and well-being 

Learn More About Us

Thumbtack embraces diversity. We are proud to be an equal opportunity workplace and do not discriminate on the basis of sex, race, color, age, pregnancy, sexual orientation, gender identity or expression, religion, national origin, ancestry, citizenship, marital status, military or veteran status, genetic information, disability status, or any other characteristic protected by federal, provincial, state, or local law. We also will consider for employment qualified applicants with arrest and conviction records, consistent with applicable law. 

Thumbtack is committed to working with and providing reasonable accommodation to individuals with disabilities. If you would like to request a reasonable accommodation for a medical condition or disability during any part of the application process, please contact: recruitingops@thumbtack.com

If you are a California resident, please review information regarding your rights under California privacy laws contained in Thumbtack’s Privacy Policy, available at https://www.thumbtack.com/privacy/.
