Lead Site Reliability Engineer - Federal Team

Saviynt

8+ Years | Los Angeles, California, United States (Hybrid) | Full Time | 5 months ago

Job Summary

Saviynt is seeking a Lead Site Reliability Engineer for their Federal Team. This role involves performing customer deployments, migrations, and upgrades in cloud environments, installing and configuring Saviynt products, and troubleshooting incidents. Responsibilities include managing and maintaining cloud infrastructure on AWS, Azure, or Google Cloud, automating manual tasks, developing and maintaining CI/CD pipelines, and troubleshooting cloud-related infrastructure issues. The engineer will also automate infrastructure setup using Infrastructure as Code (IaC), collaborate with development and operations teams, maintain compliance with security and quality standards, and create technical documentation. The role requires designing and implementing novel solutions to automate cloud environment provisioning and developing automation scripts to streamline processes, reduce repetitive tasks, and eliminate human error. The position also involves configuring and deploying monitoring tools.

Must Have

U.S. Citizenship
8+ years experience in observability, SRE, or cloud platform roles
4+ years hands-on cloud experience (AWS, Azure)
3+ years experience in software development (Python, NodeJS, Java)
Advanced expertise in container orchestration (Kubernetes)
Hands-on experience with observability tools (Prometheus, Grafana, etc.)
Experience driving adoption of SLOs, SLIs, error budgets
Strong experience with IaC (Terraform, Helm)
Proven leadership in setting engineering standards

Good to Have

Meet US persons on US soil requirements
Undergo full background investigation/screening
Undergo IAL3 requirements

Perks & Benefits

Competitive total rewards package
Learning and tremendous opportunities to grow and advance
Discretionary bonus plan

Job Description

Saviynt is an identity authority platform built to power and protect the world at work. In a world of digital transformation, where organizations are faced with increasing cyber risk but cannot afford defensive measures to slow down progress, Saviynt’s Enterprise Identity Cloud gives customers unparalleled visibility, control and intelligence to better defend against threats while empowering users with right-time, right-level access to the digital technologies and tools they need to do their best work.

This opportunity is in the Saviynt Labs organization. We design, build and run the leading Enterprise Identity solutions. Our product teams innovate industry leading solutions. The engineering teams design, build and run SaaS software built on leading edge technologies. We focus on engineering excellence and we attract the best talent in our industry. Our cloud services are built on AWS, GCP and Azure with a global presence. Our customers love what we do and work with us to build the future customer experience at scale.

WHAT YOU WILL BE DOING

Perform customer deployments, migrations, and upgrades in the cloud environment.
Install and configure Saviynt product(s) following the installation procedure and organizational guidelines.
Troubleshoot and resolve incidents while collaborating with the development and IT teams to minimize downtime and maintain service quality.
Manage and maintain cloud infrastructure on platforms such as AWS, Azure, or Google Cloud. Monitor cloud resources to ensure availability and scalability.
Automate any manual work being performed pre/during/post deployments.
Troubleshoot cloud-related infrastructure incidents and issues.
Develop and maintain CI/CD pipelines to ensure reliable and efficient software delivery. Monitor and troubleshoot issues within the CI/CD pipelines.
Automate infrastructure setup and maintenance using Infrastructure as Code (IaC) tools.
Collaborate with development, operations, and QA teams to improve deployment processes.
Maintain compliance with security and quality standards throughout the CI/CD pipeline.
Create and maintain technical documents for cloud infrastructure and related processes.
Design and implement novel solutions to automate cloud-environment provisioning.
Develop automation solutions to streamline processes, such as scripts that run specific tasks on systems, in order to reduce repetitive tasks and eliminate human error.
Configure and deploy monitoring tools.

WHAT YOU BRING

U.S. Citizenship: Applicants must be United States citizens.
8+ years of professional experience in observability, SRE, or cloud platform roles, with demonstrated success in leading strategic initiatives and cross-team collaborations.
4+ years of hands-on cloud experience (AWS, Azure), with deep understanding of cloud-native architectures and observability practices.
Proven track record of designing and operating highly available and resilient systems in public cloud environments (especially AWS).
3+ years of experience in software development using Python, NodeJS, or Java, with strong focus on automation, CI/CD integration, and DevOps practices.
Advanced expertise in container orchestration platforms (Kubernetes) and service mesh technologies.
Hands-on experience implementing observability at scale using tools such as Prometheus, Grafana, OpenTelemetry, ELK/OpenSearch, Datadog, CloudWatch, or Azure Monitor.
Demonstrated success in driving adoption of SLOs, SLIs, error budgets, and automated alerting frameworks across engineering teams.
Strong experience with infrastructure as code (e.g., Terraform, Helm) and automated deployment pipelines.
Proven leadership in setting engineering standards, mentoring team members, and driving initiatives that reduce MTTD/MTTR and improve operational excellence.
Strong analytical skills, communication capabilities, and a strategic mindset to influence and guide technical direction across large-scale engineering teams.

Meet US persons on US soil requirements
Undergo full background investigation/screening
Undergo IAL3 requirements (Identity proofing to include I-9 document verification, biometric collection, and mailing address confirmation)

$135,000 - $180,000 a year

We offer you a competitive total rewards package, learning and tremendous opportunities to grow and advance in your career. At Saviynt, it is not typical for an individual to be hired at or near the top of the range for their role, and final compensation decisions are dependent on many factors including but not limited to location; skill sets; experience and training; licensure and certifications; and other relevant business and organizational needs. A reasonable estimate of the current range is $135,000 - $180,000 annually.

You may also be eligible to participate in a Saviynt discretionary bonus plan, subject to the rules governing the program, whereby an award, if any, depends on various factors, including, without limitation, individual and organizational performance.

If required for this role, you will:

- Complete security & privacy literacy and awareness training during onboarding and annually thereafter

- Review (initially and annually thereafter), understand, and adhere to Information Security/Privacy Policies and Procedures such as (but not limited to):

  - Data Classification, Retention & Handling Policy
  - Incident Response Policy/Procedures
  - Business Continuity/Disaster Recovery Policy/Procedures
  - Mobile Device Policy
  - Account Management Policy
  - Access Control Policy
  - Personnel Security Policy
  - Privacy Policy

Saviynt is an amazing place to work. We are a high-growth, Platform as a Service company focused on Identity Authority to power and protect the world at work. You will experience tremendous growth and learning opportunities through challenging yet rewarding work that directly impacts our customers, all within a welcoming and positive work environment. If you're resilient and enjoy working in a dynamic environment, you belong with us!

Saviynt is an equal opportunity employer, and we welcome everyone to our team. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.

18 Skills Required For This Role

SaaS Business Models, Account Management, Problem Solving, Talent Acquisition, Quality Control, Incident Response, AWS, Service Mesh, Azure, Prometheus, Grafana, Terraform, ELK, Helm, CI/CD, Kubernetes, Python, Java