Staff Site Reliability Engineer

8+ Years • DevOps • $137,500 PA - $236,500 PA

Job Summary

The Cloud Infrastructure and Site Reliability Engineering team at PayPal is responsible for ensuring high availability, performance, and scalability of critical systems powering Shopping/Honey’s business. This role involves collaborating with cross-functional teams, automating operational processes, and driving reliability best practices to improve end-to-end system architecture and reduce incidents. The Staff Site Reliability Engineer identifies issues, recommends best practices, leads functional projects, analyzes business trends, and contributes to process improvements while guiding junior engineers. Day-to-day tasks include owning service reliability across cloud regions, fostering a DevOps culture, leading containerization on GKE, managing infrastructure with Terraform, enhancing alerting, participating in on-call rotations, and collaborating with product and engineering teams.
Must have:
  • Manage and deliver large-scale reliability improvement projects.
  • Drive identification and optimization of performance bottlenecks.
  • Architect and implement scalable infrastructure solutions.
  • Lead design and enhancement of monitoring frameworks.
  • Improve system resilience and implement disaster recovery strategies.
  • Lead capacity planning initiatives.
  • Collaborate with development, operations, and technical teams.
  • Act as a technical mentor.
  • Define and execute long-term reliability engineering strategies.
  • Develop and enforce operational excellence best practices.
  • Own and enhance reliability of services across cloud regions.
  • Foster and advocate for a DevOps culture.
  • Lead containerization, deployment, and scaling on GKE.
  • Set up and manage cloud infrastructure using Terraform.
  • Enhance and automate alerting, incident detection, and recovery.
  • Participate in on-call rotation to meet business SLAs.
  • Work closely with teams in Agile Scrum and Kanban workflows.
  • Perform impact analysis and root cause analysis during incidents.
  • Champion a service-first mindset, addressing team needs.
Good to have:
  • 8+ years in Cloud Infrastructure, SRE, or DevOps Engineering.
  • B.S. or M.S. degree in Computer Science, Engineering, or a related technical field.
  • 4+ years hands-on experience with GKE and Harness in public/private clouds (AWS, GCP, Azure).
  • 4+ years hands-on experience with Infrastructure-as-code (Terraform, CloudFormation).
  • 4+ years hands-on experience with CI/CD pipelines (CircleCI, Harness, Jenkins, ArgoCD).
  • Experience in Node, Python, or Go.
  • Strong understanding of Google Cloud Logging, DataDog, or other monitoring and observability tools.
  • Ability to effectively diagnose and resolve performance bottlenecks within GCP at the infrastructure and application layers.
  • Strong leadership abilities.
  • Customer focus and commitment to quality.
  • Great interpersonal skills; solid communication skills, written and verbal.
  • Ability to remain composed, methodical, and think fast in a high-pressure environment.
  • Experience in managing, collaborating, and influencing global teams.
  • Organized, detail-oriented, and able to manage multiple tasks simultaneously with the ability to appropriately prioritize.
Perks:
  • Flexible work environment
  • Employee shares options
  • Health insurance
  • Life insurance
  • Medical benefits
  • Dental benefits
  • Vision benefits
  • Financial health support
  • Physical health support
  • Mental health support

Job Details

The Company

PayPal has been revolutionizing commerce globally for more than 25 years. Creating innovative experiences that make moving money, selling, and shopping simple, personalized, and secure, PayPal empowers consumers and businesses in approximately 200 markets to join and thrive in the global economy.

We operate a global, two-sided network at scale that connects hundreds of millions of merchants and consumers. We help merchants and consumers connect, transact, and complete payments, whether they are online or in person. PayPal is more than a connection to third-party payment networks. We provide proprietary payment solutions accepted by merchants that enable the completion of payments on our platform on behalf of our customers.

We offer our customers the flexibility to use their accounts to purchase and receive payments for goods and services, as well as the ability to transfer and withdraw funds. We enable consumers to exchange funds more safely with merchants using a variety of funding sources, which may include a bank account, a PayPal or Venmo account balance, PayPal and Venmo branded credit products, a credit card, a debit card, certain cryptocurrencies, or other stored value products such as gift cards and eligible credit card rewards. Our PayPal, Venmo, and Xoom products also make it safer and simpler for friends and family to transfer funds to each other. We offer merchants an end-to-end payments solution that provides authorization and settlement capabilities, as well as instant access to funds and payouts. We also help merchants connect with their customers, process exchanges and returns, and manage risk. We enable consumers to engage in cross-border shopping and merchants to extend their global reach while reducing the complexity and friction involved in enabling cross-border trade.

Our beliefs are the foundation for how we conduct business every day. We live each day guided by our core values of Inclusion, Innovation, Collaboration, and Wellness. Together, our values ensure that we work together as one global team with our customers at the center of everything we do – and they push us to ensure we take care of ourselves, each other, and our communities.

Job Summary:

The Cloud Infrastructure and Site Reliability Engineering team at PayPal is responsible for ensuring high availability, performance, and scalability of critical systems powering Shopping/Honey’s business. We collaborate with cross-functional teams, automate operational processes, and drive reliability best practices to improve end-to-end system architecture and reduce incidents. This role identifies issues and recommends best practices to enhance system reliability, leads functional projects, analyzes business trends, and contributes to process improvements while providing guidance to junior engineers.

Job Description:

Essential Responsibilities:

  • Manage and deliver large-scale reliability improvement projects, ensuring systems are performant, available, and resilient.
  • Drive the identification of performance bottlenecks and lead initiatives to optimize and scale critical systems and services.
  • Architect and implement scalable infrastructure solutions to support growing user demands while maintaining system reliability.
  • Lead the design and enhancement of monitoring frameworks, ensuring systems are highly observable, and support the response to production incidents.
  • Take ownership of improving system resilience by designing fault-tolerant architectures and implementing disaster recovery strategies.
  • Lead capacity planning initiatives to ensure system resources are proactively managed, preventing downtime or performance degradation under high load.
  • Work closely with development, operations, and other technical teams to ensure seamless system integration and align on best practices for reliability.
  • Act as a technical mentor within the organization, guiding teams through complex reliability challenges and promoting a culture of excellence.
  • Help define and execute long-term reliability engineering strategies and standards to ensure the scalability and performance of core services.
  • Develop and enforce best practices for operational excellence, including automation, incident management, and system monitoring, across engineering teams.

Minimum Qualifications:

  • Minimum of 8 years of relevant work experience and a Bachelor's degree or equivalent experience.

Preferred Qualifications:

  • 8+ years in Cloud Infrastructure, Site Reliability Engineering (SRE), DevOps Engineering, or related fields
  • B.S. or M.S. degree in Computer Science, Engineering, or a related technical field; equivalent experience may be considered in lieu of a degree.
  • 4+ years of hands-on experience deploying, managing, and optimizing containerized applications using GKE and Harness in both public and private cloud environments (AWS, GCP, Azure, etc.), preferably Google Cloud Platform (GCP).
  • 4+ years of hands-on experience with Infrastructure-as-code (Terraform, CloudFormation), CI/CD pipelines (CircleCI, Harness, Jenkins, ArgoCD), and experience in Node, Python, or Go.
  • Strong understanding of using Google Cloud Logging, DataDog, or other monitoring and observability tools.
  • Ability to effectively diagnose and resolve performance bottlenecks within GCP at the infrastructure and application layers.
  • Strong leadership abilities; must have customer focus and commitment to quality.
  • Must have great interpersonal skills; solid communication skills, written and verbal.
  • Ability to remain composed, methodical, and think fast in a high-pressure environment.
  • Experience in managing, collaborating, and influencing global teams.
  • Must be organized, detail-oriented, and able to manage multiple tasks simultaneously with the ability to appropriately prioritize.

Your day to day:

  • Own and enhance the reliability of services deployed across various cloud regions. You will proactively monitor, automate, and scale services to ensure seamless uptime and performance with an eye on cost.
  • Foster and advocate for a DevOps culture that emphasizes automation, self-service, and engineering excellence. Enable development teams to manage and deploy applications seamlessly with minimal intervention.
  • Lead the containerization, deployment, and scaling of microservices and data pipelines on Google Kubernetes Engine (GKE), with a strong emphasis on reliability and fault tolerance.
  • Set up and manage cloud infrastructure using Terraform, enabling automated, repeatable provisioning and management.
  • Continuously enhance and automate alerting, incident detection, and recovery mechanisms for critical applications and services to minimize downtime and improve system reliability.
  • Participate in an on-call rotation to meet business SLAs, quickly troubleshoot and resolve issues, and document runbooks for consistent incident response processes.
  • Work closely with Product Owners, Engineering Managers, and cross-functional teams in Agile Scrum and Kanban workflows to deliver iterative improvements and meet evolving business needs.
  • Perform impact analysis during incidents, collaborate with teams for root cause analysis, and implement preventive measures to avoid recurrence.
  • Champion a service-first mindset while supporting engineering teams, swiftly addressing their needs and clearing blockers to help them maintain development velocity on a weekly basis.

Our Benefits:

At PayPal, we’re committed to building an equitable and inclusive global economy. And we can’t do this without our most important asset—you. That’s why we offer benefits to help you thrive in every stage of life. We champion your financial, physical, and mental health by offering valuable benefits and resources to help you care for the whole you.

We have great benefits including a flexible work environment, employee shares options, health and life insurance and more. To learn more about our benefits please visit https://www.paypalbenefits.com.

Commitment to Diversity and Inclusion

PayPal provides equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, pregnancy, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state, or local law. In addition, PayPal will provide reasonable accommodations for qualified individuals with disabilities. If you are unable to submit an application because of incompatible assistive technology or a disability, please contact us at paypalglobaltalentacquisition@paypal.com.

Belonging at PayPal:

Our employees are central to advancing our mission, and we strive to create an environment where everyone can do their best work with a sense of purpose and belonging. Belonging at PayPal means creating a workplace with a sense of acceptance and security where all employees feel included and valued. We are proud to have a diverse workforce reflective of the merchants, consumers, and communities that we serve, and we continue to take tangible actions to cultivate inclusivity and belonging at PayPal.

For general requests for consideration of your skills, please join our Talent Community.

We know the confidence gap and imposter syndrome can get in the way of meeting spectacular candidates. Please don’t hesitate to apply.

About Us

PayPal is on a mission to revolutionize commerce globally. We’re driving our company, industry, and society forward with vision and velocity. With our commitment to excellence, innovation, and talent, the possibilities are limitless.

Learn more at PayPal.com/Jobs
