Senior Site Reliability Engineer

10 Months ago • 5 Years + • Devops

Job Summary

Job Description

Senior Site Reliability Engineer with 5+ years of experience in cloud and on-prem SRE design and implementation. Must have expertise in infrastructure automation, distributed systems, and cloud platforms such as AWS, Azure, and GCP. Strong knowledge of monitoring, logging, and configuration management is essential.
Must have:
  • Infrastructure Automation
  • Distributed Systems
  • Cloud Platforms
  • Monitoring Concepts
Good to have:
  • Containerization Tech
  • Network Experience
  • Elasticsearch
  • Prometheus
Perks:
  • Global IT Team
  • Fast-Paced Environment

Job Details

Responsibilities:

About Tencent Overseas IT:
Tencent Overseas IT has the mission to empower Tencent’s rapid global growth with future-ready, global IT platforms, applications, and services. We are chartered to lead the Overseas IT strategy, architecture, roadmap, and execution. Satisfying our internal/external customers and becoming a world-class global IT team are our top aspirations.


We are seeking a Sr. Site Reliability Engineer with extensive cloud and on-prem SRE design and implementation experience.

Duties and Responsibilities:
This senior role will work closely with our internal IT teams and cloud providers to design the best global SRE architecture and solutions in the cloud. The role will also support the studios’ infrastructure and game publishing infrastructure, and their evolution to the cloud. Our customers include internal or acquired gaming studios, game publishing services, innovative offices/workplaces, various business groups, and external customers. The work scope includes understanding internal customers’ business requirements, collecting technical requirements, developing reference architectures and prototypes based on leading industry best practices, leading implementation and deployment for global locations, and troubleshooting issues when necessary.

For this SRE job, you will:
• Design, implement, and support the operations and reliability of large-scale, cloud-enabled studios, with a focus on performance at scale, real-time monitoring, logging, analysis, and alerting.
• Maintain services once they go live by measuring and monitoring availability, latency, and overall system health.
• Design and develop robust and scalable products and tools to enhance operational efficiency.
• Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
• Participate in incident response and troubleshooting efforts to minimize downtime and ensure system reliability.
• Maintain project and product documentation and knowledge.
• Be part of an on-call rotation to support production systems (if needed).


Based in Shanghai, China, this person will work closely with the global IT team and HQ teams.

Whom we are looking for:

  • A quick learner.
  • A positive, self-motivated, and passionate person.
  • Independent, persistent, and open-minded.
  • A great team player who is both dependable and autonomous.
  • Customer-oriented and able to work at a very fast pace.

Requirements:

  • 5+ years of experience with infrastructure automation and distributed systems design, including designing and developing tools for running large-scale private or public cloud systems in production
  • In-depth knowledge and understanding of monitoring concepts, alert mechanisms, log monitoring, anomaly detection, and the creation and setup of dashboards
  • In-depth knowledge of and experience with Elasticsearch and Prometheus
  • Expertise in configuration management with frameworks such as Ansible, Terraform, or Helm
  • Proficiency with programming languages such as Python and Golang, and with shell scripting, to automate tasks
  • Passion for infrastructure and monitoring as code
  • Bachelor’s degree (or higher) in Computer Science, Mathematics, or a related science or engineering major
  • Solid understanding of cloud platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes)
  • Good understanding of and hands-on experience with networking is a plus
  • Bilingual preferred (English, Chinese)

About The Company

Tencent is a world-leading internet and technology company that develops innovative products and services to improve the quality of life for people around the world.

Equal Employment Opportunity at Tencent

As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.
