Senior Site Reliability Engineer

8 Months ago • 5 Years + • DevOps

Job Summary

Job Description

Senior Site Reliability Engineer with 5+ years of experience in cloud and on-prem SRE design and implementation. Must have expertise in infrastructure automation, distributed systems, and cloud platforms such as AWS, Azure, and GCP. Strong knowledge of monitoring, logging, and configuration management is essential.
Must have:
  • Infrastructure Automation
  • Distributed Systems
  • Cloud Platforms
  • Monitoring Concepts
Good to have:
  • Containerization Tech
  • Network Experience
  • Elasticsearch
  • Prometheus
Perks:
  • Global IT Team
  • Fast-Paced Environment

Job Details

Responsibilities:

About Tencent Overseas IT:
Tencent Overseas IT has the mission to empower Tencent’s rapid global growth with future-ready, global IT platforms, applications, and services. We are chartered to lead the Overseas IT strategy, architecture, roadmap, and execution. Satisfying our internal/external customers and becoming a world-class global IT team are our top aspirations.


We are seeking a Sr. Site Reliability Engineer with extensive cloud and on-prem SRE design and implementation experience.

Duties and Responsibilities:
This senior role will work closely with our internal IT and cloud providers to design the best global SRE architecture and solution in the cloud. This role will also support the studio’s infrastructure and game publishing infrastructure, and their evolution to the cloud. Our customers include internal or acquired gaming studios, game publishing services, innovative offices/workplaces, various business groups, and external customers. The work scope includes understanding internal customers’ business requirements, collecting technical requirements, developing reference architectures and prototypes based on industry best practices, leading implementation and deployment for global locations, and troubleshooting issues when necessary.

For this SRE job, you will:
• Design, implement, and support the operation and reliability of large-scale cloud-enabled studios, with a focus on performance at scale, real-time monitoring, logging, analysis, and alerting.
• Maintain services once they go live by measuring and monitoring availability, latency, and overall system health.
• Design and develop robust and scalable products and tools to enhance operational efficiency.
• Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
• Participate in incident response and troubleshooting efforts to minimize downtime and ensure system reliability.
• Maintain project and product documentation and the knowledge base.
• Be part of an on-call rotation to support production systems (if needed).


Based in Shanghai, China, this person will work closely with the global IT team and HQ teams.

Whom we are looking for:

  • A quick learner
  • A positive, self-motivated, and passionate person
  • Independent, persistent, and open-minded
  • A great team player, both dependable and autonomous
  • Customer-oriented and able to work at a very fast pace

Requirements:

  • 5+ years of experience with infrastructure automation and distributed systems design, including designing and developing tools for running large-scale private or public cloud systems in production
  • In-depth knowledge and understanding of monitoring concepts, alert mechanisms, log monitoring, anomaly detection, and the creation and setup of dashboards
  • In-depth knowledge of and experience with Elasticsearch and Prometheus
  • Expertise in configuration management with tools such as Ansible, Terraform, or Helm
  • Proficiency with programming languages like Python, Golang, and shell scripting to automate tasks
  • Passion for infrastructure and monitoring as code
  • Bachelor’s degree (or higher) in Computer Science, Mathematics, or a related science or engineering major
  • Solid understanding of cloud platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes)
  • Good understanding of and hands-on experience with networking is a plus
  • Bilingual preferred (English, Chinese)

About The Company

Tencent is a world-leading internet and technology company that develops innovative products and services to improve the quality of life for people around the world.

Equal Employment Opportunity at Tencent

As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.
