Senior Site Reliability Engineer

1 Month ago • 8 Years + • Devops • $155,000 PA - $250,000 PA

Job Summary

Job Description

Glean is seeking a skilled Senior Site Reliability Engineer (SRE) to ensure the reliability, availability, and performance of its cloud-based services and applications. The role involves working with engineering teams to design, build, and maintain robust cloud infrastructure, focusing on automation and scaling operations in a hybrid cloud environment. Key responsibilities include technical leadership, mentorship, driving technical excellence, implementing resilient cloud architectures, managing incidents with a blameless postmortem culture, developing automation scripts and tools, optimizing cloud infrastructure for performance and cost-effectiveness, collaborating on security and compliance, designing monitoring and alerting systems, and participating in the software development lifecycle. The ideal candidate has 8+ years of SRE experience, 5+ years in software development, and 2+ years in team leadership.
Must have:
  • 8+ years of SRE experience
  • 5+ years software development
  • 2+ years team leadership
  • Cloud platform knowledge (GCP, AWS, Azure)
  • Containerization (Docker, Kubernetes)
  • Infrastructure as Code (Terraform)
  • Networking and security principles
  • Monitoring and alerting tools
Perks:
  • Competitive compensation
  • Medical, Vision, and Dental coverage
  • Generous time-off policy
  • 401k plan
  • Home office improvement stipend
  • Annual education and wellness stipends
  • Healthy lunches daily

Job Details

About Glean:

Founded in 2019, Glean is an innovative AI-powered knowledge management platform designed to help organizations quickly find, organize, and share information across their teams. By integrating seamlessly with tools like Google Drive, Slack, and Microsoft Teams, Glean ensures employees can access the right knowledge at the right time, boosting productivity and collaboration. The company’s cutting-edge AI technology simplifies knowledge discovery, making it faster and more efficient for teams to leverage their collective intelligence.

Glean was born from Founder & CEO Arvind Jain’s deep understanding of the challenges employees face in finding and understanding information at work. Seeing firsthand how fragmented knowledge and sprawling SaaS tools made it difficult to stay productive, he set out to build a better way: an AI-powered enterprise search platform that helps people quickly and intuitively access the information they need. Since then, Glean has evolved into the leading Work AI platform, combining enterprise-grade search, an AI assistant, and powerful application- and agent-building capabilities to fundamentally redefine how employees work.

About the Role:

We are seeking a skilled and motivated Senior Site Reliability Engineer (SRE) to become a valuable addition to our dynamic and innovative team. As an SRE, you will play a critical role in ensuring the reliability, availability, and performance of our cloud-based services and applications. You will work closely with our engineering teams to design, build, and maintain robust, scalable, and highly available cloud infrastructure.

Much of our software development focuses on building infrastructure to scale our operations in a hybrid cloud environment and eliminating work through automation. On the SRE team, you’ll have the opportunity to manage the complex challenges of scale and fast growth which are unique to Glean, while using your expertise in coding, algorithms, problem-solving, and SRE practices. We keep Glean applications up and running, ensuring our customers have the best and most reliable experience possible.

You will:

  • Technical Leadership and Mentorship: Play a key role in driving technical excellence and fostering a culture of reliability across engineering teams. You will lead by example, setting best practices for incident management, performance optimization, and automation. Influence best practices, drive cross-team collaborations, and contribute to the execution of key objectives in alignment with engineering leadership and cross-functional partners. Establish strong technical credibility, shaping architectural decisions and ensuring the delivery of high-quality, reliable systems.
  • Ensure High Availability: Implement and maintain resilient cloud architectures, monitor system performance, and proactively identify and resolve potential bottlenecks or points of failure. 
  • Incident Management: Participate in the primary on-call rotation; cultivate technical curiosity, a growth mindset, and a blameless postmortem culture within the team. Continuously optimize the on-call process for sustainability and efficiency.
  • Automation and Tooling: Develop and maintain automation scripts, tools, and processes to streamline system deployment, monitoring, and management tasks. Your contributions will be vital in efficiently scaling cloud operations.
  • Performance Optimization: Optimize cloud infrastructure and applications for performance, scalability, and cost-effectiveness.
  • Security and Compliance: Collaborate with security engineers to implement best practices and ensure compliance with security standards and policies.
  • Monitoring and Alerting: Design and configure advanced monitoring systems to gain insights into system behavior, set up alerts, and respond proactively to potential issues. Create and maintain comprehensive dashboards and playbooks for production on-call.
  • Software Development Consultation: Engage actively in the entire software development lifecycle. Participate in system design reviews and provide valuable SRE insights during launch reviews, influencing and enhancing system architecture.

About you: 

  • Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
  • 8+ years of experience in a senior-level Site Reliability Engineering or similar role, particularly in managing cloud-based services and infrastructure.
  • 5+ years of experience with software development in one or more programming languages.
  • 2+ years of experience managing people or teams, leading projects, and designing, analyzing, and troubleshooting distributed systems running in the cloud.
  • Strong knowledge of cloud platforms such as Google Cloud Platform, AWS, or Azure.
  • Practical experience with containerization technologies, including Docker and Kubernetes. Familiarity with infrastructure as code tools like Terraform is essential.
  • Solid understanding of networking, security principles, and best SRE and security practices.
  • Proficiency in using monitoring and alerting tools to detect and respond to potential issues effectively.

Location: 

  • This role is hybrid (3 days a week in one of our Bay Area offices)

Compensation & Benefits:

The standard base salary range for this position is $155,000 - $250,000 annually. Compensation offered will be determined by factors such as location, level, job-related knowledge, skills, and experience. Certain roles may be eligible for variable compensation, equity, and benefits.

We offer a comprehensive benefits package including competitive compensation, Medical, Vision, and Dental coverage, a generous time-off policy, and the opportunity to contribute to your 401k plan to support your long-term goals. When you join, you'll receive a home office improvement stipend, as well as annual education and wellness stipends to support your growth and wellbeing. We foster a vibrant company culture through regular events, and provide healthy lunches daily to keep you fueled and focused.

We are a diverse bunch of people and we want to continue to attract and retain a diverse range of people into our organization. We're committed to an inclusive and diverse company. We do not discriminate based on gender, ethnicity, sexual orientation, religion, civil or family status, age, disability, or race.

About The Company

We’re on a mission to make knowledge work faster and more humane. We believe that AI will fundamentally transform how people work. In the future, everyone will work in tandem with expert AI assistants who find knowledge, create and synthesize information, and execute work. These assistants will free people up to focus on the higher-level, creative aspects of their work. We’re building a system of intelligence for every company in the world. On the surface, you can think of it as Google + ChatGPT for the enterprise. Under the hood, our platform is the connective tissue between AI and knowledge. It brings all of a company’s knowledge together, understands it at a deep level, provides industry-leading search relevance over it, and connects it to generative AI agents and applications.
