Principal ML Infrastructure Engineer

2 Months ago • All levels • Devops • $216,500 PA - $390,750 PA

Job Summary

Job Description

As a Principal ML Infrastructure Engineer at Upwork, you will be responsible for designing, developing, and maintaining robust and scalable ML infrastructure components to support the company's machine learning initiatives. You will work with cross-functional teams including machine learning researchers, data scientists, and software engineers to build state-of-the-art platforms and tools. Responsibilities include owning technical workstreams, contributing to the team's product roadmap, designing and implementing distributed systems, developing tools for the ML lifecycle, and collaborating with researchers. Mentoring teammates and upholding engineering best practices will also be required.
Must have:
  • Senior/Leadership experience in ML infrastructure engineering.
  • Proven track record of delivering impactful solutions.
  • Solid foundation in software engineering and ML concepts.
  • Strong communication and teamwork skills.
  • Commitment to staying current with the latest advancements in AI.
Perks:
  • Comprehensive medical insurance coverage.
  • Unlimited paid time off.
  • 401(k) plan with matching contributions.
  • 12 weeks of paid parental leave.
  • Employee Stock Purchase Plan.

Job Details

Upwork ($UPWK) is the world’s largest work marketplace, connecting businesses with highly skilled professionals worldwide. From entrepreneurs to Fortune 100 enterprises, companies trust Upwork’s platform to access expert talent, leverage AI-powered work solutions, and drive meaningful business outcomes.

Upwork’s AI-powered platform has facilitated over $20 billion in economic opportunity for professionals worldwide. With professionals spanning 10,000+ skills, including AI and machine learning, software development, sales and marketing, customer support, finance and accounting, and more, Upwork empowers businesses of all sizes to scale, innovate, and build agile teams.


The Machine Learning Infrastructure & Data team is responsible for architecting and building the foundational ML systems and tools that enable efficient development, deployment, and management of machine learning models at scale.

As a Principal ML Infrastructure Engineer in the Machine Learning Infrastructure & Data team, you will play a pivotal role in designing, developing, and maintaining robust and scalable ML infrastructure components to support the company's machine learning initiatives. You will collaborate closely with cross-functional teams including machine learning researchers, data scientists, and software engineers to build state-of-the-art platforms and tools that accelerate the development and deployment of machine learning models.

Responsibilities:

  • Own technical workstreams from start to finish, contribute to the team’s product roadmap, and be responsible for major technical decisions and tradeoffs. Participate effectively in the team’s planning, code reviews, and design discussions.
  • Consider the effects of projects across multiple teams and proactively manage conflicts. Work with partner teams to achieve cross-departmental goals and satisfy broad requirements.
  • Design, implement, and optimize distributed systems and infrastructure components to support large-scale machine learning workflows, including data ingestion, feature engineering, model training, and serving.
  • Develop and maintain frameworks, libraries, and tools to streamline the end-to-end machine learning lifecycle, from data preparation through model training, evaluation, deployment, and monitoring.
  • Architect and implement highly available, fault-tolerant, and secure systems that meet the performance and scalability requirements of production machine learning workloads.
  • Collaborate and publish with machine learning researchers and data scientists on novel research, and translate that research into scalable and efficient software solutions.
  • Stay current with the latest advancements in machine learning infrastructure, distributed computing, and cloud technologies, and integrate them into our platform to drive innovation.
  • Mentor teammates, conduct code reviews, and uphold engineering best practices to ensure the delivery of high-quality software solutions.

What it takes to catch our eye:

  • Senior/Leadership level experience in ML infrastructure engineering, ideally at an innovative technology company.
  • Proven Impact: Show us your track record of delivering impactful solutions.
  • Innovative Thinker: Bring creativity and fresh ideas to the table.
  • Technical Proficiency: Solid foundation in software engineering and ML concepts.
  • Collaborative Mindset: Strong communication and teamwork skills are a must.
  • Continuous Learner: Stay updated with the latest advancements in the field of AI.
  • Our team's tech stack: compute (AWS, EKS, Databricks); data (Snowflake, S3, SQLMesh, Feast); workflow automation (Airflow); experiment tracking (Weights & Biases, MLflow); LLM inference (Fireworks, in-house deployment on EKS).

Come change how the world works.

At Upwork, you’ll shape talent solutions for how the world works today. We are a remote-first organization working together to create exciting remote work opportunities for a global community of professionals. We have physical offices in San Francisco and Chicago, and we currently hire full-time employees in 19 states in the United States.

At the core of our vibrant culture are shared values that form the foundation of our organization. These values revolve around trust, risk-taking, customer focus, and excellence. Our overarching mission is to create economic opportunities so that people have better lives. We foster an environment where individuals are encouraged to bring their authentic selves to work, nurturing personal and professional growth through development opportunities, mentorship programs, and participation in Upwork Belonging Communities.

We take pride in providing exceptional benefits to our employees. These include comprehensive medical insurance coverage for both you and your family, unlimited paid time off, a 401(k) plan with matching contributions, 12 weeks of paid parental leave, and an Employee Stock Purchase Plan. To explore these benefits in detail, as well as gain insights into our company values, working principles, and the overall employee experience, we invite you to visit our Life at Upwork page.

Check out our Careers page to learn more about the employee experience.   

Upwork is proudly committed to recruiting and retaining a diverse and inclusive workforce. As an Equal Opportunity Employer, we never discriminate based on race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical condition), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.



