Senior Cloud SRE

1 Week ago • 4-5 Years

About the job

Summary by Outscal

NICE Actimize Premier is seeking an Application and System Monitoring Engineer to build a next-generation monitoring system. You'll be responsible for Grafana, Prometheus, Loki, Promtail, node exporter, Icinga 2, and system metrics monitoring. Must-haves: 5+ years of production cloud operations experience, 5+ years of expertise in the Linux command line, 5+ years of using Terraform in AWS for automation, and 4+ years of scripting experience with Python and Bash.

At NICE, we don’t limit our challenges. We challenge our limits. Always. We’re ambitious. We’re game changers. And we play to win. We set the highest standards and execute beyond them. And if you’re like us, we can offer you the ultimate career opportunity that will light a fire within you.

So, what’s the role all about?

NICE Actimize Premier is seeking an Application and System Monitoring Engineer to take our existing CloudOps monitoring to the next level. In this position, you will work with a multitude of modern tools and technologies to build the next generation of our monitoring system properly and efficiently, and to troubleshoot and resolve issues in our development, test, and production environments. The ideal candidate can work in a dynamic and complex software build environment and is an energetic self-starter with a passion to build, innovate, and achieve excellence.

What will you be doing?

  • Ability to design, implement and improve Grafana, Prometheus, Loki, Promtail, node exporter.
  • Log parsing and management.
  • Configuration of alerting, push notifications to VictorOps (now Splunk On-Call), and email notifications.
  • Architect, design, and implement Icinga 2 monitoring and alerting.
  • Ability to monitor system metrics and parse logs.
  • Ability to automate tasks using Bash and/or Python scripting.
  • Predictive monitoring of systems and applications.
  • Familiarity with JVM internals and the use of JMX and REST for monitoring.
  • Familiarity with AWS infrastructure.
  • Deep understanding of Java applications, TLS, Apache.
  • Automated checks of performance of system metrics in Grafana.
  • Automated checks of performance of Web Applications.
  • Problem-solving and troubleshooting, including performing root cause analysis to design preventative activities.
  • Crafting and maintaining dashboards and reports, pulling together monitoring data across multiple platforms within the same tool as well as across multiple tools.
  • Assisting with writing scripts and queries that can provide environment self-healing capabilities.

Have you got what it takes?

  • Experience with using monitoring tools in a production environment.
  • 5+ years of production cloud operations experience.
  • 5+ years of expertise in the Linux command line.
  • 5+ years of using Terraform in AWS for automation. Hands-on with automation and actively seeking out opportunities to automate manual processes.
  • 5+ years of strong, hands-on experience building production services in AWS.
  • 4+ years of experience with scripting using Python and Bash.
  • Ability to participate in an on-call rotation.
  • Considerable knowledge of IT equipment and diagnostic tools.
  • Considerable knowledge of principles and techniques of systems analysis, design, development and programming.
  • Considerable knowledge of principles of information systems.
  • Considerable knowledge of the capabilities of computer technology.
  • Knowledge of methods and procedures used to conduct detailed analysis and design of computer systems.
  • Knowledge of practices and issues in system security and disaster recovery.
  • Knowledge of computer operating systems.

What’s in it for you?

Join an ever-growing, market-disrupting, global company where the teams, made up of the best of the best, work in a fast-paced, collaborative, and creative environment! As the market leader, every day at NICE is a chance to learn and grow, and there are endless internal career opportunities across multiple roles, disciplines, domains, and locations. If you are passionate, innovative, and excited to constantly raise the bar, you may just be our next NICEr!

Enjoy NICE-FLEX!

At NICE, we work according to the NICE-FLEX hybrid model, which enables maximum flexibility: 2 days working from the office and 3 days of remote work, each week. Naturally, office days focus on face-to-face meetings, where teamwork and collaborative thinking generate innovation, new ideas, and a vibrant, interactive atmosphere.

Requisition ID: 4809

Reporting into: Tech Manager, Engineering

Role Type: Individual Contributor

About NICE

NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime and ensure public safety. Every day, NICE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions.

Known as an innovation powerhouse that excels in AI, cloud and digital, NICE is consistently recognized as the market leader in its domains, with over 8,500 employees across 30+ countries.

NICE is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, age, sex, marital status, ancestry, neurotype, physical or mental disability, veteran status, gender identity, sexual orientation or any other category protected by law.
