Senior Site Reliability Engineer

9 Months ago • 5 Years + • Devops

Job Summary

Job Description

Senior Site Reliability Engineer with 5+ years of experience in cloud and on-prem SRE design and implementation. Must have expertise in infrastructure automation, distributed systems, and cloud platforms such as AWS, Azure, and GCP. Strong knowledge of monitoring, logging, and configuration management is essential.
Must have:
  • Infrastructure Automation
  • Distributed Systems
  • Cloud Platforms
  • Monitoring Concepts
Good to have:
  • Containerization Tech
  • Network Experience
  • Elasticsearch
  • Prometheus
Perks:
  • Global IT Team
  • Fast-Paced Environment

Job Details

Responsibilities:

About Tencent Overseas IT:
Tencent Overseas IT has the mission to empower Tencent’s rapid global growth with future-ready, global IT platforms, applications, and services. We are chartered to lead the Overseas IT strategy, architecture, roadmap, and execution. Satisfying our internal/external customers and becoming a world-class global IT team are our top aspirations.


We are seeking a Sr. Site Reliability Engineer with extensive cloud and on-prem SRE design and implementation experience.

Duties and Responsibilities:
This senior role will work closely with our internal IT and cloud providers to design the best global SRE architecture and solution in the cloud. The role will also support the studio’s infrastructure and game publishing infrastructure, and their evolution to the cloud. Our customers include internal or acquired gaming studios, game publishing services, innovative offices/workplaces, various business groups, and external customers. The work scope includes understanding internal customers’ business requirements, collecting technical requirements, developing reference architectures and prototypes based on leading industry best practices, leading implementation and deployment for global locations, and troubleshooting issues when necessary.

For this SRE job, you will:
• Design, implement, and support the operation and reliability of a large-scale, cloud-enabled studio, with a focus on performance at scale, real-time monitoring, logging, analysis, and alerting.
• Maintain services once they go live by measuring and monitoring availability, latency, and overall system health.
• Design and develop robust and scalable products and tools to enhance operational efficiency.
• Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
• Participate in incident response and troubleshooting efforts to minimize downtime and ensure system reliability.
• Maintain project and product documentation and knowledge.
• Be part of an on-call rotation to support production systems (if needed).


Based in Shanghai, China, this person will work closely with the global IT team and HQ teams.

Whom we are looking for:

  • A quick learner
  • A positive, self-motivated, and passionate person
  • Independent, insistent, and open-minded
  • A great team player who is both dependable and autonomous
  • Customer-oriented and able to work at a very fast pace

Requirements:

  • 5+ years of experience with infrastructure automation and distributed systems design, including designing and developing tools for running large-scale private or public cloud systems in production
  • In-depth knowledge and understanding of monitoring concepts, alert mechanisms, log monitoring, anomaly detection, and the creation and setup of dashboards
  • In-depth knowledge of and experience with Elasticsearch and Prometheus
  • Expertise in configuration management with a framework such as Ansible, Terraform, or Helm
  • Proficiency with programming languages like Python, Golang, and shell scripting to automate tasks
  • Passion for infrastructure and monitoring as code
  • Bachelor’s degree (or higher) in Computer Science, Mathematics, or a related science or engineering major
  • Solid understanding of cloud platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes).
  • A good understanding of and hands-on experience with networking is a plus
  • Bilingual preferred (English, Chinese)

About The Company

Tencent is a world-leading internet and technology company that develops innovative products and services to improve the quality of life for people around the world.

Equal Employment Opportunity at Tencent

As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.
