Senior Site Reliability Engineer

6 Months ago • 5 Years + • DevOps

Job Summary

Job Description

Senior Site Reliability Engineer with 5+ years of experience in cloud and on-prem SRE design and implementation. Must have expertise in infrastructure automation, distributed systems, and cloud platforms such as AWS, Azure, and GCP. Strong knowledge of monitoring, logging, and configuration management is essential.
Must have:
  • Infrastructure Automation
  • Distributed Systems
  • Cloud Platforms
  • Monitoring Concepts
Good to have:
  • Containerization Tech
  • Network Experience
  • Elasticsearch
  • Prometheus
Perks:
  • Global IT Team
  • Fast-Paced Environment

Job Details

Responsibilities:

About Tencent Overseas IT:
Tencent Overseas IT’s mission is to empower Tencent’s rapid global growth with future-ready, global IT platforms, applications, and services. We are chartered to lead the Overseas IT strategy, architecture, roadmap, and execution. Satisfying our internal and external customers and becoming a world-class global IT team are our top aspirations.


We are seeking a Sr. Site Reliability Engineer with extensive cloud and on-prem SRE design and implementation experience.

Duties and Responsibilities:
This senior role will work closely with our internal IT teams and cloud providers to design the best global SRE architecture and solutions in the cloud. The role will also support studio infrastructure and game publishing infrastructure, and their evolution to the cloud. Our customers include internal or acquired gaming studios, game publishing services, innovative offices/workplaces, various business groups, and external customers. The scope of work includes understanding internal customers’ business requirements, collecting technical requirements, developing reference architectures and prototypes based on industry best practices, leading implementation and deployment for global locations, and troubleshooting issues when necessary.

For this SRE job, you will:
• Design, implement, and support the operation and reliability of large-scale, cloud-enabled studios, with a focus on performance at scale and on real-time monitoring, logging, analysis, and alerting.
• Maintain services once they go live by measuring and monitoring availability, latency, and overall system health.
• Design and develop robust and scalable products and tools to enhance operational efficiency.
• Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
• Participate in incident response and troubleshooting efforts to minimize downtime and ensure system reliability.
• Maintain project and product documentation and knowledge.
• Be part of an on-call rotation to support production systems (if needed).


Based in Shanghai, China, this person will work closely with the global IT team and HQ teams.

Whom we are looking for:

  • A quick learner.
  • A positive, self-motivated, and passionate person.
  • Independent, persistent, and open-minded.
  • A great team player who is both dependable and autonomous.
  • Customer-oriented and able to work at a very fast pace.

Requirements:


  • 5+ years of experience with infrastructure automation and distributed systems design, including designing and developing tools for running large-scale private or public cloud systems in production
  • In-depth knowledge and understanding of monitoring concepts, alerting mechanisms, log monitoring, anomaly detection, and the creation and setup of dashboards
  • In-depth knowledge of and experience with Elasticsearch and Prometheus
  • Expertise in configuration management with frameworks such as Ansible, Terraform, or Helm
  • Proficiency with programming languages such as Python and Golang, and with shell scripting, to automate tasks
  • Passion for infrastructure and monitoring as code
  • Bachelor’s degree (or higher) in Computer Science, Mathematics, or a related science or engineering major
  • Solid understanding of cloud platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes)
  • A good understanding of, and hands-on experience with, networking is a plus
  • Bilingual preferred (English, Chinese)

Similar Jobs

Next Level Business Services - Full Stack Developer

Next Level Business Services

Jersey City, New Jersey, United States (On-Site)
5 Months ago
ByteDance - Software Engineer, SRE - Platform Services

ByteDance

San Jose, California, United States (On-Site)
1 Month ago
CleverTap - Senior Backend Engineer - Platform

CleverTap

Mumbai, Maharashtra, India (Hybrid)
5 Months ago
ByteDance - Software Engineer, SRE - Platform Services

ByteDance

Seattle, Washington, United States (On-Site)
6 Days ago
EXUSIA - Senior Data Engineer

EXUSIA

Hyderabad, Telangana, India (Remote)
1 Month ago
Twitch - Software Development Engineer

Twitch

San Francisco, California, United States (On-Site)
1 Month ago
Trend Micro - (Sr.) Cloud Developer (Vision One)

Trend Micro

Taipei City, Taiwan (On-Site)
6 Months ago
N-iX - Senior DevOps Engineer

N-iX

India (Remote)
1 Month ago
ION - Cloud Engineer Kubernetes

ION

Italy (Hybrid)
5 Months ago
Luxoft - Siebel L2 Support Consultant

Luxoft

New Delhi, Delhi, India (Remote)
4 Months ago


Similar Skill Jobs

ByteDance - Senior Site Reliability Engineer, ML System - Foundation Model

ByteDance

Seattle, Washington, United States (On-Site)
2 Months ago
Visa - Staff Systems Engineer - DevEx

Visa

Singapore, Singapore (On-Site)
5 Months ago
Interactive Brokers - Technical Operations Specialist (TOPS)

Interactive Brokers

Greenwich, Connecticut, United States (Hybrid)
5 Months ago
Interactive Brokers - Senior Python Developer – Compliance Technology

Interactive Brokers

Mumbai, Maharashtra, India (Hybrid)
5 Months ago
Hive Innovative Group - Senior PHP developer

Hive Innovative Group

Cairo, Cairo Governorate, Egypt (On-Site)
8 Months ago
ByteDance - Site Reliability Engineer, Traffic Platform - 2025 Start

ByteDance

Singapore (On-Site)
5 Months ago
Samsung Semiconductor - Staff DevOps Engineer

Samsung Semiconductor

San Jose, California, United States (Hybrid)
2 Months ago
Rambus - SMTS Verification Engineering

Rambus

Bengaluru, Karnataka, India (Hybrid)
6 Months ago
ByteDance - Principal Site Reliability Engineer, CDN

ByteDance

Singapore (On-Site)
5 Months ago
ByteDance - Senior Security System Engineer

ByteDance

Singapore (On-Site)
2 Months ago


Jobs in Shanghai, Shanghai, China

Animoca Brands - Frontend Developer

Animoca Brands

China (Remote)
5 Months ago
Tencent - Senior Game Designer

Tencent

Shanghai, Shanghai, China (On-Site)
3 Months ago
NVIDIA - Deep Learning Solution Architect

NVIDIA

Shanghai, Shanghai, China (On-Site)
2 Months ago
Tencent - Project Management - Level Infinite Pass

Tencent

Shenzhen, Guangdong Province, China (On-Site)
1 Week ago
Zengame Technology - Mobile Client Development Engineer

Zengame Technology

Shenzhen, Guangdong Province, China (On-Site)
1 Week ago
Wargaming - Level Artist

Wargaming

Shanghai, Shanghai, China (On-Site)
1 Month ago
NVIDIA - Senior Thermal Design Engineer

NVIDIA

Shanghai, Shanghai, China (On-Site)
3 Weeks ago
Riot Games - Senior Business Operation Manager

Riot Games

Shanghai, Shanghai, China (On-Site)
7 Months ago
NVIDIA - HR Business Partner

NVIDIA

Shanghai, Shanghai, China (On-Site)
2 Months ago
Zengame Technology - Advertisement Video Designer

Zengame Technology

Shenzhen, Guangdong Province, China (On-Site)
2 Months ago


DevOps Jobs

Velotio Technologies - Software Engineer (Data Engineering)

Velotio Technologies

Maharashtra, India (Remote)
2 Weeks ago
PlayStation Global - Sr. Software Engineer - ML/AI DevOps

PlayStation Global

San Francisco, California, United States (On-Site)
6 Days ago
Saviynt - Senior Principal Software Engineer - Privileged Access Management (PAM)

Saviynt

El Segundo, California, United States (Hybrid)
5 Months ago
Buckman - Senior Lead Digital Innovation Engineer - Solution Architect

Buckman

Chennai, Tamil Nadu, India (On-Site)
5 Months ago
Rackspace Technology - Data Architect

Rackspace Technology

Vietnam (Remote)
6 Days ago
Velotio Technologies - Infrastructure Architect

Velotio Technologies

Maharashtra, India (Remote)
2 Weeks ago
Tencent - Senior Cloud Solution Architect

Tencent

California, United States (On-Site)
1 Week ago
EXUSIA - Ab Initio Data Engineer

EXUSIA

United States (On-Site)
5 Months ago
Ubisoft - Linux DevOps Systems Administrator

Ubisoft

Montreal, Quebec, Canada (On-Site)
1 Month ago
SmileGate - Head of IT Infrastructure/Service Operations

SmileGate

Seongnam-si, Gyeonggi-do, South Korea (On-Site)
1 Month ago


About The Company

Tencent is a world-leading internet and technology company that develops innovative products and services to improve the quality of life of people around the world.


Founded in 1998 with its headquarters in Shenzhen, China, Tencent's guiding principle is to use technology for good. Our communication and social services connect more than one billion people around the world, helping them to keep in touch with friends and family, access transportation, pay for daily necessities, and even be entertained.


Tencent also publishes some of the world's most popular video games and other high-quality digital content, enriching interactive entertainment experiences for people around the globe.


Tencent also offers a range of services such as cloud computing, advertising, FinTech, and other enterprise services to support our clients' digital transformation and business growth.


Tencent has been listed on the Stock Exchange of Hong Kong since 2004.
