Software Engineering Manager, Site Reliability, Cloud Incident Response

8 Hours ago • 8-11 Years

About the job

Summary by Outscal

Must have:
  • Bachelor's degree or equivalent practical experience
  • 8 years of experience with software development
  • 3 years of experience in a technical leadership role
  • 2 years of experience in people management
Good to have:
  • Master's degree or PhD in Computer Science
  • Experience working in a changing organization

Minimum qualifications:

  • Bachelor's degree or equivalent practical experience.
  • 8 years of experience with software development in one or more programming languages (e.g., Python, C, C++, Java, JavaScript).
  • 3 years of experience in a technical leadership role overseeing projects, with 2 years of experience in a people management or supervision/team leadership role.

Preferred qualifications:

  • Master's degree or PhD in Computer Science, or a related technical field.
  • Experience working in a changing organization.

About the job

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services (both our internally critical and our externally visible systems) have reliability and uptime appropriate to customers' needs and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance.

Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you’ll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding, algorithms, complexity analysis and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.

The team's mission is to create a dependable experience for GCP customers. In this role, you will respond to major incidents across all of GCP and help coordinate, mitigate, or resolve them. The Cloud Incident Response Team supports the responders, tooling, and outcomes for GCP Major Incidents. The team collaborates across GCP products, customer-facing teams, and a wide range of stakeholders.

Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.

Responsibilities

  • Participate in on-call rotation supporting Critical Incident Response for GCP.
  • Focus on high-quality customer outcomes and continuous collaboration across GCP teams.
  • Create IMAG training and processes for the incident management life cycle, partnering with Cloud SRE UTLs and the Cloud Support leadership team.
  • Build systems and tooling to support the team and to improve visibility, issue detection, and communications to customers, stakeholders, and customer-facing teams.
  • Define and escalate risks in Cloud, and reduce incident probabilities with strategic and tactical/pragmatic approaches as needed.

About The Company

A problem isn't truly solved until it's solved for all. Googlers build products that help create opportunities for everyone, whether down the street or across the globe. Bring your insight, imagination and a healthy disregard for the impossible. Bring everything that makes you unique. Together, we can build for everyone.
