Software Engineer III, AI/ML, Technical Infrastructure

9 Hours ago • 2-4 Years • Artificial Intelligence

Job Summary

Job Description

This Software Engineer III role at Google focuses on AI/ML technical infrastructure. Responsibilities include measuring and optimizing model performance on Google Cloud, identifying and resolving performance bottlenecks, developing training materials, contributing to product improvement through code development and testing, and conducting performance profiling and debugging. The ideal candidate possesses strong software development skills, expertise in data structures and algorithms, and experience with ML infrastructure. The role involves collaboration with internal teams and ensuring adherence to best practices. The position is within Google's MSCA organization, impacting Google services and Google Cloud customers.
Must have:
  • Bachelor's degree or equivalent practical experience
  • 2+ years of software development experience (or 1 year with an advanced degree)
  • 2+ years of experience with data structures or algorithms
Good to have:
  • Master's degree or PhD in Computer Science or a related technical field
  • 4+ years of software development experience
  • 1+ year of experience with ML infrastructure or performance
  • Proficiency in cloud services, particularly Google Cloud Platform

Job Details


Minimum qualifications:

  • Bachelor’s degree or equivalent practical experience.
  • 2 years of experience with software development in one or more programming languages, or 1 year of experience with an advanced degree.
  • 2 years of experience with data structures or algorithms.

Preferred qualifications:

  • Master's degree or PhD in Computer Science or related technical fields.
  • 4 years of experience in software development using one or more programming languages, with expertise in data structures and algorithms.
  • 1 year of experience with ML infrastructure or performance.
  • Proficient in cloud services such as Compute, Storage, and Networking, particularly on Google Cloud Platform.

About the job

Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google’s needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full stack as we continue to push technology forward.

The ML, Systems, & Cloud AI (MSCA) organization at Google designs, implements, and manages the hardware, software, machine learning, and systems infrastructure for all Google services (Search, YouTube, etc.) and Google Cloud. Our end users are Googlers, Cloud customers and the billions of people who use Google services around the world.

We prioritize security, efficiency, and reliability across everything we do - from developing our latest TPUs to running a global network, while driving towards shaping the future of hyperscale computing. Our global impact spans software and hardware, including Google Cloud’s Vertex AI, the leading AI platform for bringing Gemini models to enterprise customers.

Responsibilities

  • Measure and optimize AI/ML model performance on Google Cloud infrastructure.
  • Identify and resolve performance bottlenecks, collaborating with internal infrastructure teams to enhance support for demanding AI workloads as needed.
  • Develop and deliver high-quality training and demos for both customers and internal teams.
  • Contribute to ongoing product improvement by identifying bugs, recommending enhancements, and writing and testing production-quality code.
  • Conduct in-depth performance profiling, debugging, and troubleshooting of training and inference workloads, ensuring adherence to best practices through design and code reviews.

About The Company

A problem isn't truly solved until it's solved for all. Googlers build products that help create opportunities for everyone, whether down the street or across the globe. Bring your insight, imagination and a healthy disregard for the impossible. Bring everything that makes you unique. Together, we can build for everyone.
