Intermediate Site Reliability Engineer

1 Day ago • All levels • $103,600 PA - $222,000 PA

Job Summary

Job Description

As an Intermediate Site Reliability Engineer (SRE) at GitLab, you will be responsible for ensuring the smooth operation of all user-facing services and production systems. This role requires a blend of operational expertise and software engineering skills to implement best practices for availability, reliability, and scalability. You will design and implement scalable networking infrastructure, collaborate with cross-functional teams, respond to incidents, and automate operational tasks. The role involves specialization in systems, algorithms, and distributed systems. The team is focused on automation and building reliable systems, supporting one of the largest single-tenancy open-source SaaS sites.
Must have:
  • Google Cloud Platform networking expertise.
  • Experience with Terraform infrastructure as code.
  • Experience with Ansible or Chef configuration management tools.
  • Experience with the Kubernetes ecosystem, including Helm.
  • Programming skills in Ruby or Go.
  • Understanding of network protocols (TCP/IP, HTTP/HTTPS, DNS).
  • Comfortable with scripting languages (Ruby, Go, Bash).
  • Experience with GitLab CI or equivalent.

Job Details

GitLab is an open core software company that develops the most comprehensive AI-powered DevSecOps Platform, used by more than 100,000 organizations. Our mission is to enable everyone to contribute to and co-create the software that powers our world. When everyone can contribute, consumers become contributors, significantly accelerating the rate of human progress. This mission is integral to our culture, influencing how we hire, build products, and lead our industry. We make this possible at GitLab by running our operations on our product and staying aligned with our values. Learn more about Life at GitLab.

Thanks to products like Duo Enterprise and Duo Workflow, customers get the benefit of AI at every stage of the SDLC. The same principles built into our products are reflected in how our team works: we embrace AI as a core productivity multiplier. All team members are encouraged and expected to incorporate AI into their daily workflows to drive efficiency, innovation, and impact across our global organisation.

An overview of this role

GitLab is a complete DevOps platform, delivered as a single application. From project planning and source code management to CI/CD, monitoring, and security, we help teams deliver software faster and more efficiently while strengthening their security and compliance postures.

As an Intermediate Site Reliability Engineer (SRE) at GitLab, you are responsible for keeping all user-facing services and other GitLab production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to our operating environments and the GitLab codebase.

GitLab SREs specialize in systems (operating systems, storage subsystems, networking), while implementing best practices for availability, reliability and scalability, with varied interests in algorithms and distributed systems.

What you’ll do  

  • Design and implement a highly scalable networking infrastructure to support the needs of current and future GitLab platforms and offerings.
  • Collaborate closely with cross-functional teams and other teams throughout Infrastructure-Platforms on projects to drive GitLab’s future.
  • Respond to incidents on an on-call rotation (our team is distributed globally, so you are only on call during your daytime hours!) and participate in incident review.
  • Lead initiatives through problem definition, scoping, design, and project management.
  • Act as a subject matter expert within the GitLab Infrastructure-Platforms department, specializing in our networking and rate limiting services.
  • Automate every operational task.

What you’ll bring 

  • Google Cloud Platform expertise, specifically around networking (VPCs, subnets, load balancers), GKE configuration, and scaling.
  • Experience with Terraform infrastructure as code.
  • Experience with configuration management tools such as Ansible and Chef.
  • Experience with the Kubernetes ecosystem, including Helm.
  • Programming skills and professional experience in Ruby or Go.
  • Understanding of network protocols (TCP/IP, HTTP/HTTPS, DNS).
  • Familiarity with network observability tools and traffic analysis.
  • Comfortable with scripting languages (Ruby, Go, Bash) for automation.
  • Experience with GitLab CI or equivalent.
  • Ability to clearly define problems and think beyond initial solutions, looking at how to make things better in the future.
  • A drive for automating everything.
  • Ability to be a manager of one and have a strong bias for action.
  • An independent, proactive, and self-organized mindset.
  • Strong ability to clearly communicate asynchronously.
  • Excitement to be doing something different every day, from project work to production change requests to emergency response.

About the team

The Production Engineering Foundations team owns the networking infrastructure for GitLab from edge to ingress. Running the largest GitLab instance in existence (and in fact, one of the largest single-tenancy open-source SaaS sites on the Internet) means we are constantly faced with unique and rewarding challenges that directly impact our users every day. Our future is all about increasing automation and enabling other teams by building paved roads for things like rate limiting and edge networks, so we can continue to scale even bigger with enterprise-level expectations around reliability and availability. Thanks to our Transparency value, you can see how we work on our team page. You can even see what we’re working on right now.
