Principal Software Engineer, Google Compute Engine Control Plane

5 Months ago • 15 Years + • DevOps

Job Description

Google Compute Engine (GCE) is at the heart of the Google Cloud Platform (GCP). As Principal Software Engineer, you'll lead the Compute organization in designing and developing cutting-edge projects. Responsibilities include developing new AI/ML offerings, designing capacity-aware scheduling, driving architectural decisions for reliability and scalability, ensuring API modularity and compatibility, and maximizing code reuse. The role requires leadership in global projects, setting technical direction, and delivering customer-focused products. This position involves working on the core infrastructure of GCP, impacting nearly every service offered.
Must have:
  • 15+ years of experience with large-scale distributed systems
  • Technical leadership & global project experience
  • Experience with cloud solutions architecture, development, and maintenance
  • Customer-focused iterative product delivery
  • Networking and compute infrastructure expertise
Good to have:
  • Hyperscale cloud technology experience
  • Deep understanding of AI/ML infrastructure (GPUs, TPUs, LLMs)
  • AI/ML use case experience (training, inference, tuning)

Job Details

Minimum qualifications:

  • 15 years of experience with large-scale distributed systems and architectures.
  • Experience in technical leadership, leading global projects and setting technical direction for teams.
  • Experience with customer-focused, iterative product and feature delivery.
  • Experience in networking, compute infrastructure, and architecting, developing, or maintaining cloud solutions.

Preferred qualifications:

  • Experience working on or with hyperscale cloud technologies.
  • Deep understanding of AI/ML-related infrastructure technologies (e.g., GPUs, TPUs, LLMs, foundation models) and use cases (e.g., training, inference, tuning).

About the job

Google Compute Engine (GCE) is at the heart of the Google Cloud Platform (GCP). It underlies and powers almost every service (e.g., VMs, databases, data analytics, Kubernetes, AI/ML, batch, cloud functions, monitoring, alerting, etc.) that GCP offers.

As the Principal Software Engineer, you will lead the Compute organization in the ideation, design, and development of many cutting-edge projects and initiatives running in parallel.

Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.

Responsibilities

  • Develop new, easy-to-use AI/ML-related offerings leveraging Google’s software stack.
  • Design capacity-aware scheduling capabilities to automatically move workloads between zones and regions.
  • Drive key architectural decisions to ensure reliability, security, performance, and scalability.
  • Drive key implementation decisions to maximize code reuse, leveraging existing frameworks and minimizing accumulation of technical debt.
  • Ensure that APIs and semantics are modular, future-proof, and compatible with other parts of GCE and GCP, so that the user experience stays consistent.

About The Company

A problem isn't truly solved until it's solved for all. Googlers build products that help create opportunities for everyone, whether down the street or across the globe. Bring your insight, imagination and a healthy disregard for the impossible. Bring everything that makes you unique. Together, we can build for everyone.
