Staff Site Reliability Engineering - Cloud

Masovian Voivodeship, Poland (Hybrid)

1 Month ago • All levels • DevOps

About the job

14 skills required for this role

java

shell

scala

kubernetes

spark

grafana

rust

splunk

python

hadoop

design-patterns

mysql

team-player

networking

Job Description

As a Staff Site Reliability Engineer - Cloud at Visa, you'll be responsible for ensuring the stability, availability, performance, and efficiency of Visa's applications and systems. You will be involved in both operational duties like on-call support and proactive maintenance, as well as developing systems and software to enhance site reliability. This role will involve collaborating with teams to identify design gaps, perform root cause analysis, and implement permanent solutions to ensure the smooth and uninterrupted operation of Visa's platforms. You'll also have the opportunity to design dashboards, alerts, and automation tools to improve monitoring and problem resolution.

Must have:

Hands-on experience with Hadoop Ecosystem (Kafka, Spark, Scala, Hive)
Strong understanding of Cloud Technologies, Kubernetes, AI, and MLOps
Proficient in Core Java
Experience with Golang and Rust
Experience with OO design and design patterns
Knowledge of Networking protocols and the OSI stack
Hands-on experience with DB2, MySQL, or equivalent databases
Ability to analyze thread dumps, heap dumps, garbage collection, and other JVM components
Strong problem-solving and debugging skills
Knowledge of production support processes like incident/change/problem management, call triaging, and escalation procedures
Experience in Shell Scripting and Python
Basic knowledge of Akamai/Cloudflare and Active-Active application setups
Deep understanding of SOA principles and Web Services technologies: REST & SOAP
Experience with Web Application Development
Knowledge of observability tools like Grafana, Opera, and Splunk
Strong teamwork and collaboration skills

Good to have:

Knowledge of the Payment Industry

Company Description

Visa is a world leader in payments and technology, with over 259 billion payments transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable, and secure payments network, enabling individuals, businesses, and economies to thrive while driven by a common purpose – to uplift everyone, everywhere by being the best way to pay and be paid.

Make an impact with a purpose-driven industry leader. Join us today and experience Life at Visa.

Job Description

As part of Visa's Product Reliability Engineering (PRE) organization, you will be responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning. In this role, your time will be split between operations/on-call duties and developing systems and software that help increase site reliability and performance. Site reliability engineering (SRE) fuses the software engineering and operations disciplines.

You will have the opportunity to participate in design reviews, identify design gaps, detect problems, perform root cause analysis, and identify and implement permanent solutions. You will also have the opportunity to bring your creative curiosity and technology experience to work every day and help build dashboards, audits, alerts, and solutions.

Major Responsibilities

-The Product Reliability Engineering (PRE) group prides itself on keeping the applications and systems of Visa up and running to cater to the 24/7 needs of the business.

-Support critical applications and ensure their stability by performing proactive maintenance, engaging in automation activities, and carrying out root cause analysis and remediation.

-Write and maintain scripts to monitor system activity, including application smoke tests during production implementations.
-Develop and maintain automation tools to handle all engineering and application support tasks.

-Engage in production issue troubleshooting bridge calls, provide immediate service restoration, follow up on root cause analysis, and ensure a permanent fix is implemented to avoid similar problems in the future.
-Coordinate and execute application releases and production deployments.

-Must be flexible about working in shifts, on weekends, and during extended hours when needed.

This is a hybrid position. Hybrid employees can alternate between working remotely and in the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Qualifications

Preferred Qualifications

-Hands-on experience with the Hadoop Ecosystem (Kafka, Spark, Scala, Hive) is a plus.
-Knowledge of Cloud Technologies, Kubernetes, AI, and MLOps will be advantageous.
-Knowledge/Experience in Core Java.
-Knowledge/Experience in Golang, Rust.
-Experience in OO design and design patterns is a plus.
-Experience with Networking protocols and the OSI stack.
-Hands-on experience with DB2, MySQL, or an equivalent database. Ability to understand and write complex database queries.
-Ability to analyze thread dumps, heap dumps, garbage collection, and other related JVM components.
-Ability to solve complex production problems and debug code.
-Working knowledge of production support processes such as incident/change/problem management, call triaging, and escalation procedures.
-Experience in Shell Scripting.
-Experience in Python.
-Basic knowledge of Akamai/Cloudflare and Active-Active application setups.
-Deep understanding of SOA principles and Web Services technologies: REST & SOAP.
-Experience in Web Application Development is a plus.
-Knowledge of observability tools like Grafana, Opera, and Splunk.
-We count on your curiosity and creativity to understand customer requirements and our processes and to come up with creative solutions and improvements.
-You have a passion for working with people and for mentoring junior colleagues to reach new heights.
-Strong team player with the ability to collaborate as part of a virtual team across the globe.
-Ability to prioritize and perform multiple tasks simultaneously and deliver on time with quality.
-Strong self-motivation, sense of ownership, and ability to retain focus under stress during crisis situations.
-Knowledge of the Payment Industry is a plus.

Additional Information

Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.
