Site Reliability Engineer

1 Day ago • All levels

Job Summary

Job Description

The Site Reliability Engineer (SRE) will focus on production AppOps, managing scalable systems and using automation to improve reliability and velocity. The SRE will support NCR’s Financial Services business unit in improving system design and operation, ensuring production performance and high availability in the cloud. The role supports innovation and operational improvement through software engineering practices, improving product usability and addressing emerging business needs. Key responsibilities include maintaining and scaling production services, bridging the gap between development and operations, improving service reliability, writing automation code, and participating in disaster recovery planning. The SRE will also implement monitoring alerts, manage escalation paths, and provide prompt support during critical incidents, preparing the PIR/RCA so that problems are resolved and the downtime window is minimized. The role includes participation in on-call rotas/schedules and may require off-hours assistance during production outages.
Must have:
  • Experience in a DevOps/SRE role with large-scale production environments.
  • Experience developing and debugging code (Ansible, Python, Shell, etc.).
  • 2+ years deploying scalable web applications/services.
  • 2+ years with Azure/GCP/AWS.
  • 2+ years with Docker, Kubernetes, and an early version of OpenShift.
  • Experience with Linux, shell scripting, PKI, TLS/SSL, and networking.
Good to have:
  • Experience with orchestration, automation, and configuration management tools.
  • Experience with Kubernetes, system virtualization, on-prem and/or hybrid cloud computing.
  • Experience with one or more CI/CD tools like Azure DevOps/Jenkins/GitHub Actions.
  • Experience with log management, including monitoring, aggregation, alerting, and graphing.
  • Experience with Kafka, Elasticsearch, or Cassandra.

Job Details

About NCR Atleos

NCR Atleos, headquartered in Atlanta, is a leader in expanding financial access. Our dedicated 20,000 employees optimize the branch, improve operational efficiency and maximize self-service availability for financial institutions and retailers across the globe.

TITLE: Site Reliability Engineer, G10

LOCATION: Hyderabad, India

Summary:

We are looking for a Site Reliability Engineer (SRE), initially focused on production AppOps, who can manage scalable systems using automation best practices that improve reliability and velocity, and who can enable monitoring of the operational health of services throughout their lifecycle, including metrics collection, aggregation, and visualization.

As a member of the SRE team, you will support NCR’s Financial Services business unit, product, and technology teams to improve the design and operation of systems, focusing on making them scalable, reliable, and efficient while ensuring production performance and high availability of products/services primarily deployed/running in the cloud. You will influence the development and implementation of reliable production systems and services to address emerging business needs (such as Cloud-based SaaS). SREs pride themselves on the resiliency and stability of production systems, yet at the same time are committed to innovation and operational improvement through the application of software engineering practices to operations.


You will make our products easier to adopt and use by improving the product, tools, processes, and documentation. You are someone who strives for six 9s or better in availability/uptime!

Key Areas of Responsibility (or where we need your support):

  • Maintain and scale production services and servers for complex, high-throughput cloud services.
  • Bridge and own the union between development, quality, security, and operations.
  • Improve the scalability, reliability, capacity, and performance of the SaaS services.
  • Write automation code for provisioning and operating infrastructure at massive scale.
  • Act as an experienced software engineer focused on application reliability and scalability.
  • Contribute to the continuous improvement of our software delivery processes and practices in a multi-location, multidisciplinary team to empower and accelerate product development.
  • Design, configure, manage, and monitor systems in support of our product development teams.
  • Participate in disaster recovery planning and execution.
  • Maintain and patch servers supporting SaaS products, including Windows and Linux servers running in private data centers and/or on cloud PaaS providers (Azure).
  • Collaborate with other teams to promote code using CI/CD and AppSec tooling.
  • Collaborate with development, support, and dependent teams, using intuition, experience, and understanding to create SLIs, SLOs, and SLAs.
  • Implement monitoring alerts, build dashboards, and manage escalation paths.
  • Provide prompt support during critical incidents and prepare the PIR/RCA, helping both to remediate the problem and to minimize the downtime window.
  • Participate in on-call rotas/schedules; off-hours assistance may be required during production outages.

IDEAL TECHNICAL AND PROFESSIONAL SKILLS:   

  • BS degree in Computer Science or a related technical field, or 5 years of prior relevant experience.
  • Extensive experience in a DevOps / SRE role with demonstrable experience in deploying and managing large-scale production environments in Azure, AWS, GCP, and multi-data center environments.
  • Experience developing and debugging code in one or more of the following: Ansible, Python, Shell, Perl, Golang, JavaScript, Java, C, C++, .NET.
  • 2+ years deploying and supporting high-traffic, scalable web applications/services.
  • 2+ years with Azure/GCP/AWS
  • 2+ years with Docker, Kubernetes, and an early version of OpenShift.
  • Experience with Linux, shell scripting, PKI, TLS/SSL, networking, firewalls, load balancers, and backup.
  • Experience in designing, analyzing, and running large-scale distributed systems.
  • Experience securely hosting and troubleshooting public-facing services in Azure, AWS, or GCP.
  • Experience with orchestration, automation, and configuration management tools like Ansible (or Puppet, Chef, Terraform, Helm, or related technology), Git, and Fabric.
  • Excellent analysis, debugging, root-cause identification, and troubleshooting skills.
  • Experience with Kubernetes, system virtualization, on-prem and/or hybrid cloud computing, cloud Identity, security systems, cloud monitoring and logging, and/or local/cloud storage.
  • Experience with one or more CI/CD and related tools like Azure DevOps/Jenkins/GitHub Actions, Artifactory, Harness, CloudBuild.
  • Experience with application disaster recovery, migration, roll-back plans, expansion, routine deployments, and system upgrades.
  • Experience with log management, including monitoring, aggregation, alerting, and graphing (e.g., NagiosXI, Prometheus, ELK, Sensu, StackDriver, or TICK stacks).
  • Bonus points for experience with Kafka, Elasticsearch, or Cassandra.
  • Extra bonus points for Cloud certifications and exposure to Harness.

Hybrid

**Visit our careers site for a list of the benefits offered in your region in addition to a competitive base salary and strong work/family programs.

Statement to Third Party Agencies

To ALL recruitment agencies: NCR Atleos only accepts resumes from agencies on the NCR Atleos preferred supplier list.  Please do not forward resumes to our applicant tracking system, NCR Atleos employees, or any NCR Atleos facility. NCR Atleos is not responsible for any fees or charges associated with unsolicited resumes.

EEO Statement

NCR Atleos is an equal opportunity employer. It is NCR Atleos' policy to hire, train, promote and pay associates based on their job-related qualifications, ability, and performance, without regard to race, colour, creed, religion, national origin, citizenship status, sex, marital status, age, physical or mental disability, sexual orientation, or veteran status.

NOTE: Please review HR CMP Policy 420 concerning guidelines around internal employee transfers between roles.

EEO Statement: NCR Atleos is an equal-opportunity employer. It is NCR Atleos policy to hire, train, promote, and pay associates based on their job-related qualifications, ability, and performance, without regard to race, color, creed, religion, national origin, citizenship status, sex, sexual orientation, gender identity/expression, pregnancy, marital status, age, mental or physical disability, genetic information, medical condition, military or veteran status, or any other factor protected by law.

Offers of employment are conditional upon passage of screening criteria applicable to the job.

About The Company

Help us bring innovation to financial institutions across the globe. At NCR Atleos, you’ll have meaningful and relevant work experiences, with opportunities to learn and make a real contribution. We are dedicated to solving the challenges our customers face through continuous innovation and a commitment to setting the highest standard in self-service banking.
