Software Engineer, Site Reliability Engineering

5 Months ago • 2 Years + • DevOps • Backend Development

Job Details

Minimum qualifications:

  • Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
  • 2 years of experience with data structures/algorithms and software development in one or more programming languages such as Python, Java, C, C++ or Go.

Preferred qualifications:

  • Experience working in computing, distributed systems, storage, or networking.
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.
  • Ability to debug code and to automate routine tasks.
  • Excellent problem-solving approach, along with effective communication skills.

About the job

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services (both our internally critical and our externally visible systems) have reliability and uptime appropriate to customers' needs, along with a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance.

Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you will have the opportunity to manage the complex challenges of scale that are unique to Google Cloud, while using your expertise in coding, algorithms, complexity analysis and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, and we strive to create an environment that provides the support and mentorship needed to learn and grow.

Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running. From developing and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio possible. We're proud to be our engineers' engineers and love voiding warranties by taking things apart so we can rebuild them. We keep our networks up and running, ensuring our users have the best and fastest experience possible.

Responsibilities

  • Improve the reliability, scalability and efficiency of our distributed data query engine, data warehouse and data analytics platforms.
  • Improve the reliability and performance of YouTube payment and business systems.
  • Propose and review designs impacting YouTube's data infrastructure.
  • Advocate for reliability methodologies within the YouTube Data organization by actively engaging in postmortem reviews and Incident Management at Google (IMAG) trainings, and by supporting teams during outages.

About The Company

A problem isn't truly solved until it's solved for all. Googlers build products that help create opportunities for everyone, whether down the street or across the globe. Bring your insight, imagination and a healthy disregard for the impossible. Bring everything that makes you unique. Together, we can build for everyone.
