Staff Machine Learning Engineer (Infrastructure)

1 Month ago • All levels • Devops

Job Summary

Job Description

Nu, a leading digital banking platform with over 105 million customers, is seeking a Staff Machine Learning Engineer (Infrastructure). This role focuses on building and scaling foundational cloud, data, and AI infrastructure to support machine learning workloads across the organization. The engineer will design and optimize high-performance training, inference, and data processing systems, ensuring reliability, scalability, and efficiency. Responsibilities include enabling AI practitioners with robust compute, model serving, monitoring, and orchestration frameworks to drive innovation and operational excellence.
Must have:
  • Expertise in cloud infrastructure (AWS or GCP)
  • Experience with Kubernetes and container orchestration
  • Experience with infrastructure as code (Terraform, Pulumi)
  • Experience writing ETL pipelines
  • Experience with ML infrastructure (training, inference, monitoring)
  • Knowledge of networking, storage, and security in large-scale systems
  • Experience optimizing performance and cost efficiency of AI workloads
  • Proven track record of leading complex infrastructure projects
  • Experience in designing high-availability, fault-tolerant systems for AI/ML
  • Hands-on experience with monitoring, observability, and alerting for production systems
Good to have:
  • Proficiency in Python and Go
  • Experience with Spark or BigQuery
  • Familiarity with workflow orchestration tools (e.g., Dagster, Airflow)
  • Familiarity with model-serving frameworks (e.g., Ray Serve, vLLM)
  • Experience in developer tooling, platform engineering, or ML infrastructure
Perks:
  • Remote work
  • Quarterly trips to Sao Paulo
  • Top Tier Medical Insurance
  • Top Tier Dental and Vision Insurance
  • 20 days time off
  • 14 company holidays
  • Work-life balance
  • Life Insurance and AD&D
  • Extended maternity and paternity leaves
  • Nucleo - Learning platform
  • NuLanguage - Language learning program
  • NuCare - Mental health and wellness assistance
  • 401K
  • Savings Plans - Health Savings Account and Flexible Spending Account

Job Details

About Nu

Nu is the world’s largest digital banking platform outside of Asia, serving over 105 million customers across Brazil, Mexico, and Colombia. The company has been leading an industry transformation by leveraging data and proprietary technology to develop innovative products and services. Guided by its mission to fight complexity and empower people, Nu caters to customers’ complete financial journey, promoting financial access and advancement with responsible lending and transparency. The company is powered by an efficient and scalable business model that combines low cost to serve with growing returns. Nu’s impact has been recognized in multiple awards, including Time 100 Companies, Fast Company’s Most Innovative Companies, and Forbes World’s Best Banks. Learn more: https://international.nubank.com.br/careers/

 

About the role

At Nubank, one of our engineering principles is "Leverage Through Platforms". We believe that platforms are a very efficient way of solving complex concerns shared across different products and teams.

The AI Infrastructure Squad within the AI Core BU builds and scales the foundational cloud, data, and AI infrastructure that powers machine learning workloads across the organization. We design and optimize high-performance training, inference, and data processing systems while ensuring reliability, scalability, and efficiency. Our team enables AI practitioners by providing robust compute, model serving, monitoring, and orchestration frameworks to drive innovation and operational excellence.

 

As a Software Engineer in the AI Core BU, you are expected to demonstrate:

  • Strong expertise in cloud infrastructure (AWS or GCP) and distributed computing.
  • Experience with Kubernetes, container orchestration, and infrastructure as code (Terraform, Pulumi).
  • Proficiency in programming languages. Experience with Python and Go is a plus.
  • Experience writing ETL pipelines (experience with Spark or BigQuery is preferred).
  • Experience with ML infrastructure, including model training, batch and online inference, and monitoring.
  • Strong knowledge of networking, storage, and security in large-scale systems.
  • Familiarity with workflow orchestration tools (e.g., Dagster, Airflow) and model-serving frameworks (e.g., Ray Serve, vLLM).
  • Experience optimizing performance and cost efficiency of AI workloads on cloud and on-prem environments.
Project Experience:
  • Proven track record of leading complex infrastructure projects from design to production.
  • Comfortable working on ambiguous and evolving projects, quickly identifying key challenges and driving solutions.
  • Experience in designing high-availability, fault-tolerant systems for AI/ML workloads.
  • Experience with developer tooling, platform engineering, or ML infrastructure, ensuring AI teams can build and deploy efficiently.
  • Hands-on experience with monitoring, observability, and alerting for production systems.

We’re looking for individuals who thrive in horizontal, high-impact teams that build foundational infrastructure for multiple AI initiatives: people who enjoy solving deep technical challenges at the intersection of AI, cloud, and distributed systems, and who take ownership with a strong product mindset, ensuring infrastructure is reliable, scalable, and built around user needs. We value collaborators and mentors who help teammates grow while upholding high engineering standards. If you’re passionate about building scalable, efficient, and cost-effective AI infrastructure that drives meaningful, real-world impact, we’d love to meet you.

If these challenges interest you and you want to work on a highly engaged and talented team, this is the place for you!

 

What we have to offer

  • High-Impact, Cross-Functional Work – Our team sits at the core of AI operations, enabling ML engineers, researchers, and data scientists to build and deploy models at scale. You'll work across multiple teams and business units, directly shaping AI-driven products and decisions.
  • Cutting-Edge AI & Cloud Infrastructure – Be part of a team that designs and operates high-performance AI infrastructure, spanning cloud, data, and ML platforms. You'll tackle technical challenges in distributed systems, model serving, and large-scale data processing.
  • 0 to 1 & Large-Scale Initiatives – Work on both greenfield projects and mission-critical AI infrastructure, from building scalable training pipelines to optimizing real-time inference workloads. Your work will directly influence the efficiency and scalability of AI across the company.
  • Growth & Ownership Opportunities – As a senior engineer, you'll have the autonomy to drive technical direction, lead high-impact projects, and contribute to architectural decisions. You'll also have opportunities to mentor others, shape engineering best practices, and grow into a leadership role.
  • Culture of Excellence & Collaboration – Join a team that values deep technical expertise, curiosity, and a strong engineering culture. We operate in a fast-moving environment where innovation, reliability, and efficiency drive everything we build.

 

Our Benefits

  • Remote work, with quarterly trips to São Paulo to build relationships with coworkers.
  • Top Tier Medical Insurance
  • Top Tier Dental and Vision Insurance
  • 20 days time off, 14 company holidays, and a great culture that emphasizes work-life balance.
  • Life Insurance and AD&D
  • Extended maternity and paternity leaves 
  • Nucleo - Our learning platform of courses
  • NuLanguage - Our language learning program
  • NuCare - Our mental health and wellness assistance program
  • 401K
  • Savings Plans - Health Savings Account and Flexible Spending Account


About The Company

Nubank was born in 2013 with the mission to fight against the complexity of the financial market to help our customers regain control of their financial lives. We have spent 11 years dedicated to bringing very simple ideas to places no one has ever taken them. For us, past success does not guarantee the future, which is why every day is “Day 1.” Being part of Nubank is embarking on a long-term journey where we know each challenge sparks creativity and innovation, where obstacles become opportunities to go a little further. Recently, we reached the milestone of 100 million customers globally, a significant achievement in our journey, but we know it wasn’t just the customers who chose us. We have over 8,000 Nubankers who choose to work with us daily.
