Principal Software Engineer, Google Compute Engine Control Plane

2 Hours ago • 15 Years + • DevOps

About the job

Summary by Outscal

Must have:
  • 15+ years of experience with large scale distributed systems
  • Technical leadership & global project experience
  • Experience with cloud solutions architecture, development, and maintenance
  • Customer-focused iterative product delivery
  • Networking and compute infrastructure expertise
Good to have:
  • Hyperscale cloud technology experience
  • Deep understanding of AI/ML infrastructure (GPUs, TPUs, LLMs)
  • AI/ML use case experience (training, inference, tuning)

Minimum qualifications:

  • 15 years of experience with large scale distributed systems and architectures.
  • Experience in technical leadership, leading global projects and setting technical direction for teams.
  • Experience with customer-focused, iterative product and feature delivery.
  • Experience in networking, compute infrastructure, and architecting, developing, or maintaining cloud solutions.

Preferred qualifications:

  • Experience working on or with hyperscale cloud technologies.
  • Deep understanding of AI/ML-related infrastructure technologies (e.g., GPUs, TPUs, LLMs, foundational models) and use cases (e.g., training, inference, tuning).

About the job

Google Compute Engine (GCE) is at the heart of the Google Cloud Platform (GCP). It underlies and powers almost every service (e.g., VMs, databases, data analytics, Kubernetes, AI/ML, batch, cloud functions, monitoring, alerting, etc.) that GCP offers.

As the Principal Software Engineer, you will lead the Compute organization in the ideation, design, and development of numerous cutting-edge projects and initiatives running in parallel.

Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.

Responsibilities

  • Develop new, easy-to-use AI/ML-related offerings leveraging Google’s software stack.
  • Design capacity-aware scheduling capabilities to automatically move workloads between zones and regions.
  • Drive key architectural decisions to ensure reliability, security, performance, and scalability.
  • Drive key implementation decisions to maximize code reuse, leverage existing frameworks, and minimize the accumulation of technical debt.
  • Ensure that APIs and semantics are modular, future-proof, and compatible with other parts of GCE and GCP, so that the user experience stays consistent.

About The Company

A problem isn't truly solved until it's solved for all. Googlers build products that help create opportunities for everyone, whether down the street or across the globe. Bring your insight, imagination and a healthy disregard for the impossible. Bring everything that makes you unique. Together, we can build for everyone.
