Manager, Site Reliability Engineering

4 Months ago • All levels • DevOps

Job Summary

Job Description

Lead and manage a team of Site Reliability Engineers responsible for building and maintaining Flexera's Snow Atlas platform infrastructure and tooling. Ensure the reliability, scalability, instrumentation, automation, and performance of Snow's cloud SaaS products. Advocate for SRE best practices and DevOps principles, influencing and evangelizing them across development, operations, and support teams. Manage operational reliability, fault tolerance, performance, scalability, observability, and efficiency of Flexera's cloud platforms and products.
Must have:
  • Site Reliability Engineering
  • Cloud Environments
  • Managing SRE Team
  • Azure Infrastructure
Good to have:
  • Kubernetes Infrastructure
  • Monitoring & Observability
  • IaC & Containers
  • CI/CD Tooling
Perks:
  • Remote Work
  • Team Growth

Job Details

Flexera saves customers billions of dollars in wasted technology spend. A pioneer in Hybrid ITAM and FinOps, Flexera provides award-winning, data-oriented SaaS solutions for technology value optimization (TVO), enabling IT, finance, procurement and cloud teams to gain deep insights into cost optimization, compliance and risks for each business service. Flexera One solutions are built on a set of definitive customer, supplier and industry data, powered by our Technology Intelligence Platform, that enables organizations to visualize their Enterprise Technology Blueprint™ in hybrid environments—from on-premises to SaaS to containers to cloud.

We’re transforming the software industry. We’re Flexera. With more than 50,000 customers across the world, we’re achieving that goal. But we know we can’t do any of that without our team. Ready to help us re-imagine the industry during a time of substantial growth and ambitious plans? Come and see why we’re consistently recognized by Gartner, Forrester and IDC as a category leader in the marketplace. Learn more at flexera.com.

Build, grow and lead a team that is responsible for implementing the Site Reliability Engineering practices and tools that continually improve the operational readiness, instrumentation, reliability, performance and scalability of Flexera’s Snow Atlas global cloud infrastructure, platform and products. The team is central to the success of Flexera’s SaaS solutions, and stakeholders will rely on your knowledge of and expertise in SRE and DevOps practices.

Adopting DevOps principles of delivery, the manager is responsible for the deliverables of the central team and works with stakeholders to enable Site Reliability Engineers. The manager will engage with stakeholders to identify and deliver the highest-value, highest-priority work that improves SRE capabilities, tools and services. The manager also generates actionable insights from qualitative and quantitative metrics to continually improve the operational reliability of Snow’s systems.

What you will be doing:

  • Lead, manage and coach a team of Site Reliability Engineers (SREs) responsible for building and maintaining Flexera’s Snow Atlas platform infrastructure and tooling. Manage the day-to-day execution of high-quality, prioritized deliverables that apply SRE best practices, ensuring the reliability, scalability, instrumentation, automation and performance of Snow’s cloud SaaS products.
  • As a passionate advocate of the SRE discipline and DevOps principles, engage with, influence, and seek feedback from development, operational and support teams, and evangelize best practices so that stakeholders can adopt self-service and “you build it, you run it”.
  • Manage the operational reliability, fault-tolerance, performance, scalability, observability and efficiency of Flexera’s cloud platforms and products across environments.
  • Work on incidents with team members and coordinate with wider stakeholders to resolve customer-impacting service issues promptly.
  • Partner with security and other “shared services” teams to align, automate, integrate and orchestrate specialist tooling into a common set of SRE best practices that supports the wider Software Delivery Lifecycle and Product Lifecycle.
  • Plan and execute projects in support of the SRE objectives, and ensure projects are delivered with high quality, on time, and within budget
  • Hire, develop and retain a highly skilled SRE team
  • Evaluate hardware and software technologies to improve efficiency and performance

Responsibilities:

  • Manage a team responsible for supporting an international, 24x7, Azure cloud infrastructure powering Flexera’s customer-facing service offerings
  • Participate in the design, implementation, and operation of a scalable and reliable systems infrastructure supporting a fast-growth SaaS offering
  • Ensure proper security, monitoring, alerting, and reporting for the infrastructure
  • Troubleshoot and resolve escalated issues
  • Plan capacity for all aspects of the infrastructure
  • Develop and maintain processes, tools, and documentation in support of the production environment
  • Participate in evaluation of new software, hardware and infrastructure solutions
  • Participate in an on-call rotation and be available 24x7 in an escalation capacity

Required skills and knowledge:

  • Experience as a Site Reliability Engineer in cloud environments
  • Experience managing a team of Site Reliability Engineers
  • Experience managing infrastructure in Azure
  • Experience managing Kubernetes infrastructure in the cloud
  • Experience in Monitoring & Observability practices in the cloud including tooling, logging, metrics, tracing, and alerting
  • Experience with IaC and Containers to achieve scalable, reliable, performant and secure SaaS platform infrastructure
  • Experience with CI/CD tooling to automate, orchestrate and integrate continuous delivery pipelines

Flexera is proud to be an equal opportunity employer. Qualified applicants will be considered for open roles regardless of age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by local/national laws, policies and/or regulations.

Flexera understands the value that results from employing a diverse, equitable, and inclusive workforce. We recognize that equity necessitates acknowledging past exclusion and that inclusion requires intentional effort. Our DEI (Diversity, Equity, and Inclusion) council is the driving force behind our commitment to championing policies and practices that foster a welcoming environment for all.

We encourage candidates requiring accommodations to please let us know by emailing careers@flexera.com.
