Cloud Site Reliability Engineer

3+ Years • DevOps

Job Summary

Job Description

We are seeking a Site Reliability Engineer to work with an innovative product launching in an area new to cloud services. This role involves acting as a gatekeeper for production, leading investigations into outages and performance issues, and developing automation for low-value tasks. The SRE will provide technical leadership to Cloud Operations and Support teams, focusing on developing and configuring monitoring dashboards, alerts, and observability platforms. Key responsibilities include deploying monitoring infrastructure using Bicep modules and configuring CI/CD pipelines in Azure DevOps.
Must have:
  • Act as part of a team of SREs who serve as the ‘gatekeepers’ of production, actively manage the work backlog, and develop reliability improvements.
  • Lead root-cause investigations into outages, performance, and cost issues.
  • Lead initiatives to automate low-value tasks, balanced against project delivery demands.
  • Provide technical leadership to the wider Cloud Operations and Support teams, along with oversight of the products and services they support.
  • Develop and configure monitoring dashboards and alerts in tools like Grafana and Azure Monitor.
  • Install and configure the observability platform, including tools like Grafana, Prometheus, Azure Monitor, and OpenTelemetry.
  • Develop Bicep modules for monitoring infrastructure and deploy them.
  • Develop and configure CI/CD pipelines in Azure DevOps for deploying monitoring infrastructure and monitoring objects.
Good to have:
  • Be flexible with working hours when needed to address critical or urgent matters.
  • Be able to provide on-call services from time to time as needed.
  • Exposure to Azure DevOps pipelines (CI/CD).
  • Exposure to test frameworks (NUnit, Jasmine, Selenium).
Perks:
  • Join an ever-growing, market-disrupting, global company.
  • Teams comprised of the best of the best.
  • Work in a fast-paced, collaborative, and creative environment.
  • Chance to learn and grow.
  • Endless internal career opportunities across multiple roles, disciplines, domains, and locations.
  • NiCE-FLEX hybrid model (2 days office, 3 days remote work).

Job Details

At NiCE, we don’t limit our challenges. We challenge our limits. Always. We’re ambitious. We’re game changers. And we play to win. We set the highest standards and execute beyond them. And if you’re like us, we can offer you the ultimate career opportunity that will light a fire within you.

So, what’s the role all about?

We are seeking an SRE to work with an innovative product launching in an area new to cloud services.

How will you make an impact?

  • Act as part of a team of SREs who serve as the ‘gatekeepers’ of production, actively manage the work backlog, and develop reliability improvements.
  • Lead root-cause investigations into outages, performance, and cost issues.
  • Lead initiatives to automate low-value tasks, balanced against project delivery demands.
  • Provide technical leadership to the wider Cloud Operations and Support teams, along with oversight of the products and services they support.
  • Develop and configure monitoring dashboards and alerts in tools like Grafana and Azure Monitor.
  • Install and configure the observability platform, including tools like Grafana, Prometheus, Azure Monitor, and OpenTelemetry.
  • Develop Bicep modules for monitoring infrastructure and deploy them.
  • Develop and configure CI/CD pipelines in Azure DevOps for deploying monitoring infrastructure and monitoring objects.

Have you got what it takes?

  • Must have 3+ years of experience in Site Reliability Engineering
  • Excellent technical, analytical and troubleshooting skills
  • Experience with and in-depth knowledge of databases and data handling (MS-SQL, Elasticsearch, YAML, JSON, XML)
  • Significant experience in programming or advanced scripting (C#, PowerShell etc.)
  • Experience with infrastructure/configuration as code and version control (ARM, Bicep, Git)
  • Experience managing monitoring, alerting and dashboarding platforms (Azure Monitor, Prometheus, Grafana, Elasticsearch)
  • Demonstrable experience of supporting live cloud services and platforms
  • Production experience with Kubernetes and containerization
  • Implementation and support of service level objectives (SLOs)
  • Exposure to commercial cloud providers (ideally Azure; others considered)
  • Exposure to Azure DevOps pipelines is desirable (CI/CD)
  • Exposure to test frameworks is desirable (NUnit, Jasmine, Selenium)
  • Efficient, effective, and respectful communication skills, both with customers and within internal departments, including:
    • Good listening skills, with the ability to identify and validate assumptions.
    • Effective questioning to confirm understanding of a customer problem and then provide help to solve it.
  • Methodical troubleshooting, technical skill and attention to detail used in diagnosing problems and reproducing issues in a local environment.
  • Multi-tasking and time-management to prioritise and switch between varied tasks.

You will have an advantage if you also have:

  • Flexibility with working hours when needed to address critical or urgent matters.
  • Ability to provide on-call services from time to time as needed.

What’s in it for you?

  • Join an ever-growing, market-disrupting, global company where the teams, comprised of the best of the best, work in a fast-paced, collaborative, and creative environment! As the market leader, every day at NiCE is a chance to learn and grow, and there are endless internal career opportunities across multiple roles, disciplines, domains, and locations. If you are passionate, innovative, and excited to constantly raise the bar, you may just be our next NiCEr!

Enjoy NiCE-FLEX!

At NiCE, we work according to the NiCE-FLEX hybrid model, which enables maximum flexibility: 2 days working from the office and 3 days of remote work, each week. Naturally, office days focus on face-to-face meetings, where teamwork and collaborative thinking generate innovation, new ideas, and a vibrant, interactive atmosphere.

Requisition ID: 8094

Reporting into: Technical Manager / Director of Engineering

Role Type: Individual Contributor



