SRE Manager (India)

4 Months ago • 8 Years + • DevOps

Job Summary

As an SRE Manager at Glean, you'll lead a team of engineers ensuring the reliability and performance of cloud-based services. Responsibilities include team leadership and development, implementing resilient cloud architectures, managing incidents, automating processes, optimizing performance, ensuring security and compliance, designing monitoring systems, and consulting on software development. This role requires strong cloud platform knowledge (GCP, AWS, or Azure), experience with containerization (Docker, Kubernetes), and infrastructure as code (Terraform).
Must have:
  • Lead and mentor SRE team
  • Ensure high availability of cloud services
  • Manage incidents and optimize on-call
  • Automate processes and develop tools
  • Optimize cloud infrastructure performance
  • Implement security best practices
  • Design and configure monitoring systems
  • 8+ years senior SRE experience
  • 5+ years software development experience
  • Cloud platform expertise (GCP, AWS, Azure)
  • Experience with Docker and Kubernetes
  • Knowledge of Terraform

Job Details

ABOUT GLEAN:

We’re on a mission to bring people the knowledge they need to make a difference in the world. 

Glean was founded by a seasoned team of former Google search and Facebook engineers, who wondered why we don’t have an easier way of finding what we need at work. In our personal lives, we have tools to help us find pretty much whatever we need. Why don’t we have that at work? And that was the beginning of Glean.

Glean searches across all your company’s apps to help you find exactly what you need and discover the things you should know. We’re a diverse team of curious and creative people who want to help each other get big things done—so we can help other teams do the same. 

We're backed by some of the Valley's leading venture capitalists—including Sequoia, Kleiner Perkins, Lightspeed, and General Catalyst—and have assembled a world-class team with senior leadership experience at Google, Slack, Facebook, Dropbox, Rubrik, Uber, Intercom, Pinterest, Palantir, and others.

ABOUT THE ROLE:

As an SRE Manager at Glean, you will lead a team of software and system engineers in ensuring the reliability, availability, and performance of our cloud-based services and applications. You will collaborate closely with our engineering teams to design, build, and maintain robust, scalable, and highly available cloud infrastructure. Your expertise in team leadership, coding, algorithms, problem-solving, and SRE practices will be crucial in managing the complex challenges of scale and fast growth unique to Glean.

What you will do and achieve:

  • Team Leadership and Development: Lead, mentor, and support a team of software and systems engineers, fostering their growth and success. Set team priorities and drive the execution of team OKRs with input from engineering leadership and cross-functional partners. Establish technical credibility, and influence the team's technical direction and the quality of its delivery.
  • Ensure High Availability: Implement and maintain resilient cloud architectures, monitor system performance, and proactively identify and resolve potential bottlenecks or points of failure.
  • Incident Management: Participate in the primary on-call rotation; cultivate technical curiosity, a growth mindset, and a blameless postmortem culture within the team. Continuously optimize the on-call process for sustainability and efficiency.
  • Automation and Tooling: Develop and maintain automation scripts, tools, and processes to streamline system deployment, monitoring, and management tasks. Your contributions will be vital in efficiently scaling cloud operations.
  • Performance Optimization: Optimize cloud infrastructure and applications for performance, scalability, and cost-effectiveness.
  • Security and Compliance: Collaborate with security engineers to implement best practices and ensure compliance with security standards and policies.
  • Monitoring and Alerting: Design and configure advanced monitoring systems to gain insights into system behavior, set up alerts, and respond proactively to potential issues.
  • Create and maintain comprehensive dashboards and playbooks for production on-call.
  • Software Development Consultation: Engage actively in the entire software development lifecycle. Participate in system design reviews and provide valuable SRE insights during launch reviews, influencing and enhancing system architecture.

Who you are:

  • Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
  • 8+ years of experience in a senior-level Site Reliability Engineering or similar role, particularly in managing cloud-based services and infrastructure.
  • 5+ years of experience with software development in one or more programming languages.
  • 2+ years of experience managing people or teams, leading projects, and designing, analyzing, and troubleshooting distributed systems running in the cloud.
  • Strong knowledge of cloud platforms such as Google Cloud Platform, AWS, or Azure.
  • Practical experience with containerization technologies, including Docker and Kubernetes. Familiarity with infrastructure as code tools like Terraform is essential.
  • Solid understanding of networking, security principles, and best SRE and security practices.
  • Proficiency in using monitoring and alerting tools to detect and respond to potential issues effectively.

We are a diverse bunch of people and we want to continue to attract and retain a diverse range of people into our organization. We're committed to an inclusive and diverse company. We do not discriminate based on gender, ethnicity, sexual orientation, religion, civil or family status, age, disability, or race.
