Lead Site Reliability Engineer

1 Month ago • All levels • Devops

Job Summary

Job Description

The Site Reliability Engineer Lead will oversee support operations and site reliability engineering tasks, ensuring the effective functioning of systems and applications. Key responsibilities include managing a team, monitoring system performance, collaborating with cross-functional teams, developing incident response procedures, conducting audits, leading automation implementation, and providing technical guidance. This role focuses on enhancing system performance, availability, and resiliency. The candidate should have experience with monitoring tools, containerization technologies, and strong project management skills.

This role may also be eligible for performance-based bonuses subject to company policies. In addition, this role is eligible for the following benefits subject to company policies: medical, dental, vision, pharmacy, life, accidental death & dismemberment, and disability insurance; employee assistance program; 401(k) retirement plan; 10 days of paid time off per year (some positions are eligible for need-based leave with no designated number of leave days per year); and 10 paid holidays per year.
Must have:
  • Proficiency in site reliability engineering (SRE) principles and practices.
  • Strong background in system administration, networking, and cloud computing.
  • Experience with monitoring tools such as Prometheus, Grafana, and ELK stack.
  • Knowledge of containerization technologies like Docker and Kubernetes.
  • Ability to troubleshoot complex technical issues and perform root cause analysis.
  • Excellent communication skills and ability to work collaboratively in a team environment.
  • Strong project management and leadership skills to drive initiatives efficiently.
Good to have:
  • Certifications in relevant areas such as AWS Certified DevOps Engineer or Google Professional Cloud DevOps Engineer are a plus.
Perks:
  • Medical, dental, vision, pharmacy, life, accidental death & dismemberment, and disability insurance
  • Employee assistance program
  • 401(k) retirement plan
  • 10 days of paid time off per year
  • 10 paid holidays per year

Job Details

Job description:

About HCLTech
HCLTech is a global technology company, spread across 60 countries, delivering industry-leading capabilities centered around digital, engineering, cloud and AI, powered by a broad portfolio of technology services and products. We work with clients across all major verticals, providing industry solutions for Financial Services, Manufacturing, Life Sciences and Healthcare, Technology and Services, Telecom and Media, Retail and CPG, and Public Services. We're powered by our people, a global, diverse, multi-generational talent representing 161 nationalities, whose unique spark, perspective and boundless passion drive our culture of proactive value creation and problem-solving.
Our purpose is to bring together the best of technology and our people to supercharge progress for everyone, everywhere: our clients, partners, their stakeholders, communities, and the planet. As a company, we are deeply focused on accelerating our ESG agenda. We are also creating technology-enabled sustainable solutions with and for our clients and partners. We embed ESG imperatives into every aspect of our business and ensure that the progress we supercharge is responsible, inclusive and beneficial to all our stakeholders in the long term. We have committed to achieving net zero by 2040.

To learn more about how we can supercharge progress for you, visit www.hcltech.com

Site Reliability Engineer Lead

Job Summary
The Support Lead (SRE) is responsible for overseeing the support operations and site reliability engineering tasks, ensuring the effective functioning of systems and applications. The primary goal is to enhance system performance, availability, and resiliency.

Key Responsibilities
  1. Manage a team of support engineers and SREs to provide technical support and address system issues promptly.
  2. Monitor system performance and reliability metrics, identifying areas for improvement and implementing solutions.
  3. Collaborate with cross-functional teams to optimize application performance and enhance system reliability.
  4. Develop and maintain incident response procedures and protocols to minimize system downtime.
  5. Conduct regular audits and assessments to ensure compliance with industry standards and best practices.
  6. Lead the implementation of automation tools and processes to streamline support operations and enhance efficiency.
  7. Provide technical expertise and guidance to team members, promoting a culture of continuous learning and development.

Skill Requirements
  1. Proficiency in site reliability engineering (SRE) principles and practices.
  2. Strong background in system administration, networking, and cloud computing.
  3. Experience with monitoring tools such as Prometheus, Grafana, and ELK stack.
  4. Knowledge of containerization technologies like Docker and Kubernetes.
  5. Ability to troubleshoot complex technical issues and perform root cause analysis.
  6. Excellent communication skills and ability to work collaboratively in a team environment.
  7. Strong project management and leadership skills to drive initiatives and deliver results efficiently.
  8. Certifications in relevant areas such as AWS Certified DevOps Engineer or Google Professional Cloud DevOps Engineer are a plus.
