Cloud SRE

1 Minute ago • 4 Years + • Devops

Job Summary

Job Description

As a Site Reliability Engineer (SRE) for our large and regionally distributed SaaS platform, your primary responsibilities will be to improve the reliability and availability of our mission-critical cloud-based services. You will create dashboards and metrics for observability, consult with development teams on SRE best practices, and automate tasks to reduce manual intervention. Additionally, you will assist in incident and problem management, share knowledge, mentor other SREs, and ensure compliance with processes and documentation.
Must have:
  • Create new dashboards and metrics for comprehensive observability, including SLI/SLO metrics.
  • Work with development teams to ensure proper monitoring is set up and enabled.
  • Identify evolutionary improvements to observability and monitoring solutions.
  • Consult with development teams on SRE services and best practices.
  • Create automation and tooling to reduce toil and manual intervention.
  • Assist other teams in data and performance analysis to identify root causes.
  • Review the work of other SREs and provide training and guidance.
  • Communicate effectively with both technical and non-technical peers and customers.
  • Follow established processes or help document and create new ones as necessary.
  • Document troubleshooting steps and results in appropriate locations.
  • Ensure compliance with policies, procedures, and standards.
  • Implement or coordinate remediation required by audits and assessments.
  • Estimate the time required to complete activities and projects.
Good to have:
  • Kubernetes
  • Kubernetes certification
  • Grafana
  • AWS
  • Azure
  • DevOps experience
Perks:
  • Fast-paced, collaborative, and creative environment
  • Learning and growth opportunities
  • Endless internal career opportunities across multiple roles, disciplines, domains, and locations
  • NICE-FLEX hybrid model (2 days office, 3 days remote work each week)

Job Details

At NiCE, we don’t limit our challenges. We challenge our limits. Always. We’re ambitious. We’re game changers. And we play to win. We set the highest standards and execute beyond them. And if you’re like us, we can offer you the ultimate career opportunity that will light a fire within you.

So, what’s the role all about?

As a Site Reliability Engineer (SRE) for our large and regionally distributed SaaS platform, your primary responsibilities will be to improve the reliability and availability of our mission-critical cloud-based services.

How will you make an impact?

Essential Duties and Responsibilities:

1. Observability and Monitoring:

  • Create new dashboards and metrics to provide comprehensive observability into the health and performance of development teams' applications, including SLI/SLO metrics.
  • Work with development teams to ensure proper monitoring is set up and enabled for their services.
  • Identify evolutionary improvements to the observability and monitoring solutions.

2. Reliability Consulting and Automation:

  • Consult with development teams on SRE services and best practices to help them improve the reliability of their applications.
  • Create automation and tooling to reduce toil and manual intervention.

3. Incident and Problem Management:

  • Assist other teams in data and performance analysis to identify the root causes of issues and recommend automation actions.

4. Knowledge Sharing and Mentoring:

  • Review the work of other SREs and provide training and guidance to help them improve their skills.
  • Communicate effectively with both technical and non-technical peers and customers.

5. Process and Documentation:

  • Follow established processes when performing work, or help document and create new processes as necessary.
  • Document troubleshooting steps and results in appropriate locations for historical access.
  • Ensure compliance with policies, procedures, and standards.
  • Implement or coordinate remediation required by audits and assessments, and document it as necessary.

6. Time Estimation:

  • Estimate the time required to complete activities and projects.

Have you got what it takes?

  • 4+ years of programming/scripting experience with any of the following: Go, Python, .NET (C#), Node.js
  • 4+ years of experience working within public or private cloud environments
  • 4+ years of SRE/DevOps/Observability or related experience
  • 4+ years of AWS experience
  • Experience with Agile, Jira, GitHub, monitoring, automation, dashboarding

You will have an advantage if you also have:

Kubernetes + certification, Grafana, AWS, Azure, DevOps experience.

What’s in it for you?

Join an ever-growing, market-disrupting, global company where the teams – composed of the best of the best – work in a fast-paced, collaborative, and creative environment! As the market leader, every day at NICE is a chance to learn and grow, and there are endless internal career opportunities across multiple roles, disciplines, domains, and locations. If you are passionate, innovative, and excited to constantly raise the bar, you may just be our next NICEr!

Enjoy NICE-FLEX!

At NICE, we work according to the NICE-FLEX hybrid model, which enables maximum flexibility: 2 days working from the office and 3 days of remote work, each week. Naturally, office days focus on face-to-face meetings, where teamwork and collaborative thinking generate innovation, new ideas, and a vibrant, interactive atmosphere.

Requisition ID: 7547

Reporting into: Manager, Cloud Operations

Role Type: Individual Contributor
