Senior Site Reliability Engineer

3 Months ago • 8 Years + • DevOps

Job Summary

Job Description

Rackspace is seeking a Senior Site Reliability Engineer to join their ObjectRocket team. This role involves designing, implementing, and supporting the complex architecture of hardware, software, and networking systems for a large-scale distributed enterprise system. Responsibilities include leading a global SRE team, ensuring high reliability, driving improvements through automation, responding to and mitigating incidents, and identifying patterns in problems. The ideal candidate will possess strong experience in systems administration (Unix/Linux), scripting (Perl, Python, Bash), automation tools (Ansible, Chef, Salt), and networking. They will also have experience with operational monitoring and management tools and a systematic problem-solving approach.
Must have:
  • 8+ years experience
  • Unix/Linux expertise
  • Scripting (Perl, Python, Bash)
  • Automation (Ansible, Chef, Salt)
  • Networking knowledge
  • Large-scale system experience
  • Monitoring tools expertise
  • Problem-solving skills
Good to have:
  • Object Storage experience (Petabyte scale)
  • Relational database experience (MySQL)
  • NoSQL database experience (Redis, Mongo)
  • Cloud provider experience (AWS, GCP, Azure)
  • Docker and container management experience

Job Details

We are expanding our team of motivated engineers with a proven track record of delivering a best-in-class DBaaS platform, ObjectRocket. You will have the opportunity to work with a strong team of engineers on large-scale distributed enterprise systems. You will design, implement, and support the complex architecture of hardware, software, and networking systems.

You will lead a global SRE team to provide the highest level of reliability to our customers and platform. You will drive improvement through automation and best practices. This includes responding to, mitigating, investigating, and escalating incidents when they occur. You will be responsible for stepping above day-to-day support to synthesize patterns of problems and business needs for the engineering teams. You will also be responsible for ensuring that your services' operations improve over time to enhance our business effectiveness.

Key Responsibilities:

    • Ensure completeness of the technical infrastructure to support system performance
    • Stay up to date with emerging technologies and trends in the enterprise hardware, infrastructure and networking industry
    • Partner with the application engineering team to ensure the stability and performance of our technology solutions
    • Continuously identify problems in the technology stack and processes and drive their burndown
    • Follow and execute Rackspace change management processes
    • Participate in systems/code reviews and design sessions
    • Contribute to and organize a central store of knowledge
    • Take full ownership of the product life cycle
    • Participate in on-call rotation

Qualifications:

    • Bachelor’s degree in Computer Science or equivalent experience
    • 8+ years of experience in information systems design/architecture/development
    • Strong experience in one or more of: Perl, Python, or Bash
    • Strong experience in one or more of: Ansible, Chef, or Salt
    • Strong experience working with Unix/Linux systems from kernel to shell and beyond, including system libraries, file systems, and client-server protocols.
    • Networking knowledge, e.g. TCP/IP, UDP, ICMP, MAC addresses, IP packets, DNS, SDN, OSI layers, and load balancing.
    • Experience in designing, analyzing and troubleshooting large-scale distributed systems.
    • Intermediate knowledge of operating systems.
    • Familiarity with algorithms, data structures and complexity analysis.
    • Intermediate experience designing complex SaaS applications for cloud reliability and scalability.
    • Intermediate experience with cloud infrastructure automation and CI/CD pipeline design.
    • Expertise in operational monitoring and management tools (Sensu, Prometheus, Grafana, etc.).
    • Intermediate written & verbal communication skills, both highly technical and non-technical.
    • Ability to work closely with non-technical stakeholders and executives.
    • Systematic problem-solving approach coupled with a strong sense of ownership and drive.
    • RHCE preferred.

Preferred:

    • Experience working with Object Storage systems at Petabyte scale.
    • Experience using and managing one or more relational databases (e.g. MySQL).
    • Experience with non-relational databases (preferably Redis, Mongo).
    • Experience with cloud service providers (AWS, GCP, Azure, etc.).
    • Experience with Docker and container management systems (Swarm, Kubernetes, OpenShift, etc.).

