Senior/Expert Engineer, Site Reliability Engineering (SRE)

Garena

3+ Years | Singapore, Singapore (On Site) | Full Time | 1 month ago

Job Summary

The Senior/Expert Engineer, Site Reliability Engineering (SRE) will be responsible for ensuring product scalability, stability, and performance by deep diving into development lines and understanding application mechanisms. Key duties include setting up, managing, and maintaining applications, middleware, and big-data services, performing deployments, fine-tuning, and troubleshooting. The role also involves designing automation, capacity management, full-chain stress testing, and preparing operation documentation. Candidates should have a strong background in Linux, Kubernetes, networking, and programming with Bash, Python, or Go.

Job Description

Deep-dive into development lines to learn and understand the mechanism of every application component, and promote product scalability, stability and performance.
Set up, manage and maintain product/middleware/big-data applications and services.
Perform regular and ad-hoc server-side deployments, performance fine-tuning and troubleshooting.
Design and develop automation for our workflows.
Manage capacity and resources.
Own the full-chain stress test to improve application performance and remove redundancy.
Prepare routine operation documentation.

Job Requirements

Bachelor’s or higher degree in Computer Science, Engineering, Information Systems or related fields.
Minimum 3 years of relevant full-time working experience in Site Reliability Engineering roles.
Extensive, hands-on knowledge of Linux operating systems (Ubuntu, CentOS, etc.).
Extensive, hands-on knowledge of Kubernetes and its ecosystem.
Knowledge of computer networking (TCP/IP, DNS, etc.) and operating systems.
Hands-on experience with at least one of the following programming languages: Bash, Python, Go.
Strong analytical and problem-solving skills with the ability to thrive under high-pressure situations.
Fast learner and a good team player.
Detail-oriented, cautious and prudent.

8 Skills Required For This Role

Problem Solving, Team Player, Game Texts, DNS, Linux, Kubernetes, Python, Bash
