Staff Engineer - DevOps Site Reliability

2 Months ago • All levels • DevOps

Job Summary

Job Description

Experienced L3 SRE engineer needed for a business-critical SaaS application. Responsibilities include L3 support across the full stack (infra, backend, frontend), automating SRE tools, proactive monitoring, handling business pressure, communicating effectively with various teams and end-users, incident/problem management, and working with multitenant applications. Requires strong understanding of networking, CI/CD, Python, and AWS services (especially EKS, serverless technologies, and databases). Experience with Kubernetes, Prometheus, and monitoring/logging tools is essential.
Must have:
  • EKS
  • GitHub Actions
  • Python (Strong)
  • Kubernetes (Expert)
  • Prometheus
  • L3 support across full-stack
  • Automation of SRE tools
  • Incident/Problem Management
Good to have:
  • GenAI/LLM application experience
  • AWS Managed Services
  • FastAPI and Next.js
  • Websockets
  • Cloud security concepts
  • Terraform

Job Details

Company Description

We are a Digital Product Engineering company that is scaling in a big way! We build products, services, and experiences that inspire, excite, and delight. We work at scale — across all devices and digital mediums, and our people exist everywhere in the world (19000+ experts across 33 countries, to be exact). Our work culture is dynamic and non-hierarchical. We are looking for great new colleagues. That is where you come in!

Job Description

  • Experienced L3 SRE engineer for a business-critical SaaS application.
  • Capacity to provide L3 support across the full stack, including infra, backend, and frontend, before escalation to the engineering business unit.
  • Capacity to automate SRE tools to provide proactive L3 support, close to our tech monitoring strategy.
  • Capacity to work under business pressure for business-critical applications.
  • Capacity to communicate appropriately with L1, L2, Engineering, Product Managers, leadership, and end-users during troubleshooting.
  • Experience with incident and problem management.
  • Experience with multitenant applications.
  • Solid understanding of networking concepts (TCP/IP, DNS, routing, etc.), including VPCs, subnets, firewalls, load balancing, and TLS/SSL.
  • Experience with CI/CD pipelines (e.g., Jenkins, GitHub Actions) and version control.
  • Python, React/Next.js.
  • Monitoring and logging to analyze and track resource utilization and application performance and to identify potential issues, using Grafana, Prometheus, and Loki or ELK.
  • Experience with AWS, particularly EKS, serverless, queues, and various databases.
  • Solid knowledge of Kubernetes.

Qualifications

Must have Skills: EKS, GitHub Actions, Python (Strong), Kubernetes (Expert), Prometheus.

Good to Have Skills: 

  • Previous experience building a user-facing GenAI/LLM software application.
  • Security best practices in cloud environments.
  • AWS Managed Services (RDS, Batch, Lambda, Fargate, Step Functions, SQS/SNS, etc.).
  • FastAPI and Next.js experience (if we're still using the latter).
  • WebSockets, Server-Sent Events, Pub/Sub (RabbitMQ, Kafka, etc.).
  • Cloud security concepts (IAM, access control).
  • Terraform experience. 
