Senior Site Reliability Engineer

3 Hours ago • 8 Years + • Devops • $155,000 PA - $250,000 PA

Job Summary

Job Description

Glean is seeking a skilled Senior Site Reliability Engineer (SRE) to ensure the reliability, availability, and performance of its cloud-based services and applications. The role involves working with engineering teams to design, build, and maintain robust cloud infrastructure, focusing on automation and scaling operations in a hybrid cloud environment. Key responsibilities include technical leadership, mentorship, driving technical excellence, implementing resilient cloud architectures, managing incidents with a blameless postmortem culture, developing automation scripts and tools, optimizing cloud infrastructure for performance and cost-effectiveness, collaborating on security and compliance, designing monitoring and alerting systems, and participating in the software development lifecycle. The ideal candidate has 8+ years of SRE experience, 5+ years in software development, and 2+ years in team leadership.
Must have:
  • 8+ years of SRE experience
  • 5+ years software development
  • 2+ years team leadership
  • Cloud platform knowledge (GCP, AWS, Azure)
  • Containerization (Docker, Kubernetes)
  • Infrastructure as Code (Terraform)
  • Networking and security principles
  • Monitoring and alerting tools
Perks:
  • Competitive compensation
  • Medical, Vision, and Dental coverage
  • Generous time-off policy
  • 401k plan
  • Home office improvement stipend
  • Annual education and wellness stipends
  • Healthy lunches daily

Job Details

About Glean:

Founded in 2019, Glean is an innovative AI-powered knowledge management platform designed to help organizations quickly find, organize, and share information across their teams. By integrating seamlessly with tools like Google Drive, Slack, and Microsoft Teams, Glean ensures employees can access the right knowledge at the right time, boosting productivity and collaboration. The company’s cutting-edge AI technology simplifies knowledge discovery, making it faster and more efficient for teams to leverage their collective intelligence.

Glean was born from Founder & CEO Arvind Jain’s deep understanding of the challenges employees face in finding and understanding information at work. Seeing firsthand how fragmented knowledge and sprawling SaaS tools made it difficult to stay productive, he set out to build a better way: an AI-powered enterprise search platform that helps people quickly and intuitively access the information they need. Since then, Glean has evolved into the leading Work AI platform, combining enterprise-grade search, an AI assistant, and powerful application- and agent-building capabilities to fundamentally redefine how employees work.

About the Role:

We are seeking a skilled and motivated Senior Site Reliability Engineer (SRE) to become a valuable addition to our dynamic and innovative team. As an SRE, you will play a critical role in ensuring the reliability, availability, and performance of our cloud-based services and applications. You will work closely with our engineering teams to design, build, and maintain robust, scalable, and highly available cloud infrastructure.

Much of our software development focuses on building infrastructure to scale our operations in a hybrid cloud environment and eliminating work through automation. On the SRE team, you’ll have the opportunity to manage the complex challenges of scale and fast growth which are unique to Glean, while using your expertise in coding, algorithms, problem-solving, and SRE practices. We keep Glean applications up and running, ensuring our customers have the best and most reliable experience possible.

You will:

  • Technical Leadership and Mentorship: Play a key role in driving technical excellence and fostering a culture of reliability across engineering teams. You will lead by example, setting best practices for incident management, performance optimization, and automation. Influence best practices, drive cross-team collaborations, and contribute to the execution of key objectives in alignment with engineering leadership and cross-functional partners. Establish strong technical credibility, shaping architectural decisions and ensuring the delivery of high-quality, reliable systems.
  • Ensure High Availability: Implement and maintain resilient cloud architectures, monitor system performance, and proactively identify and resolve potential bottlenecks or points of failure. 
  • Incident Management: Participate in the primary on-call rotation; cultivate technical curiosity, a growth mindset, and a blameless postmortem culture within the team. Continuously optimize the on-call process for sustainability and efficiency.
  • Automation and Tooling: Develop and maintain automation scripts, tools, and processes to streamline system deployment, monitoring, and management tasks. Your contributions will be vital in efficiently scaling cloud operations.
  • Performance Optimization: Optimize cloud infrastructure and applications for performance, scalability, and cost-effectiveness.
  • Security and Compliance: Collaborate with security engineers to implement best practices and ensure compliance with security standards and policies.
  • Monitoring and Alerting: Design and configure advanced monitoring systems to gain insights into system behavior, set up alerts, and respond proactively to potential issues. Create and maintain comprehensive dashboards and playbooks for production on-call.
  • Software Development Consultation: Engage actively in the entire software development lifecycle. Participate in system design reviews and provide valuable SRE insights during launch reviews, influencing and enhancing system architecture.

About you: 

  • Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
  • 8+ years of experience in a senior-level Site Reliability Engineering or similar role, particularly in managing cloud-based services and infrastructure.
  • 5+ years of experience with software development in one or more programming languages.
  • 2+ years of experience managing people or teams, leading projects, and designing, analyzing, and troubleshooting distributed systems running in the cloud.
  • Strong knowledge of cloud platforms such as Google Cloud Platform, AWS, or Azure.
  • Practical experience with containerization technologies, including Docker and Kubernetes. Familiarity with infrastructure as code tools like Terraform is essential.
  • Solid understanding of networking, security principles, and best SRE and security practices.
  • Proficiency in using monitoring and alerting tools to detect and respond to potential issues effectively.

Location: 

  • This role is hybrid (3 days a week in one of our Bay Area offices)

Compensation & Benefits:

The standard base salary range for this position is $155,000 - $250,000 annually. Compensation offered will be determined by factors such as location, level, job-related knowledge, skills, and experience. Certain roles may be eligible for variable compensation, equity, and benefits.

We offer a comprehensive benefits package including competitive compensation, Medical, Vision, and Dental coverage, a generous time-off policy, and the opportunity to contribute to your 401k plan to support your long-term goals. When you join, you'll receive a home office improvement stipend, as well as annual education and wellness stipends to support your growth and wellbeing. We foster a vibrant company culture through regular events, and provide healthy lunches daily to keep you fueled and focused.

We are a diverse bunch of people and we want to continue to attract and retain a diverse range of people into our organization. We're committed to an inclusive and diverse company. We do not discriminate based on gender, ethnicity, sexual orientation, religion, civil or family status, age, disability, or race.

About The Company

We’re on a mission to make knowledge work faster and more humane. We believe that AI will fundamentally transform how people work. In the future, everyone will work in tandem with expert AI assistants who find knowledge, create and synthesize information, and execute work. These assistants will free people up to focus on the higher-level, creative aspects of their work. We’re building a system of intelligence for every company in the world. On the surface, you can think of it as Google + ChatGPT for the enterprise. Under the hood, our platform is the connective tissue between AI and knowledge. It brings all of a company’s knowledge together, understands it at a deep level, provides industry-leading search relevance over it, and connects it to generative AI agents and applications.
