Senior Site Reliability Engineer

1 Month ago • 8 Years + • DevOps

About the job

Job Description

Rackspace is seeking a Senior Site Reliability Engineer to join their ObjectRocket team. This role involves designing, implementing, and supporting the complex architecture of hardware, software, and networking systems for a large-scale distributed enterprise system. Responsibilities include leading a global SRE team, ensuring high reliability, driving improvements through automation, responding to and mitigating incidents, and identifying patterns in problems. The ideal candidate will possess strong experience in systems administration (Unix/Linux), scripting (Perl, Python, Bash), automation tools (Ansible, Chef, Salt), and networking. They will also have experience with operational monitoring and management tools and a systematic problem-solving approach.
Must have:
  • 8+ years experience
  • Unix/Linux expertise
  • Scripting (Perl, Python, Bash)
  • Automation (Ansible, Chef, Salt)
  • Networking knowledge
  • Large-scale system experience
  • Monitoring tools expertise
  • Problem-solving skills
Good to have:
  • Object Storage experience (Petabyte scale)
  • Relational database experience (MySQL)
  • NoSQL database experience (Redis, Mongo)
  • Cloud provider experience (AWS, GCP, Azure)
  • Docker and container management experience
We are expanding our team of motivated engineers with a proven track record of delivering a best-in-class DBaaS platform, ObjectRocket. You will have the opportunity to work with a strong team of engineers on large-scale distributed enterprise systems. You will design, implement, and support the complex architecture of hardware, software, and networking systems.

You will lead a global SRE team to provide the highest level of reliability to our customers and platform. You will drive improvement through automation and best practices. This includes responding to, mitigating, investigating, and escalating incidents when they occur. You will be responsible for stepping above day-to-day support to synthesize patterns of problems and business needs for the engineering teams. You will also be responsible for ensuring that your services' operations improve over time to enhance our business effectiveness.

Key Responsibilities:

    • Ensure completeness of the technical infrastructure to support system performance
    • Stay up to date with emerging technologies and trends in the enterprise hardware, infrastructure and networking industry
    • Partner with the application engineering team to ensure the stability and performance of our technology solutions
    • Continuously identify problems in the technology stack and processes, and burn them down
    • Follow and execute Rackspace change management processes
    • Participate in systems/code reviews and design sessions
    • Contribute to and organize a central store of knowledge
    • Take full ownership of product life cycle
    • Participate in on-call rotation

Qualifications:

    • Bachelor’s degree in Computer Science or equivalent experience
    • 8+ years of experience in information systems design, architecture, or development
    • Strong experience in one or more of: Perl, Python, or Bash
    • Strong experience in one or more of: Ansible, Chef, or Salt
    • Strong experience working with Unix/Linux systems from kernel to shell and beyond, including system libraries, file systems, and client-server protocols.
    • Networking, e.g. TCP/IP, UDP, ICMP, MAC addresses, IP packets, DNS, SDN, OSI layers, and load balancing.
    • Experience in designing, analyzing and troubleshooting large-scale distributed systems.
    • Intermediate knowledge of operating systems.
    • Familiarity with algorithms, data structures and complexity analysis.
    • Intermediate experience designing complex SaaS applications for cloud reliability and scalability.
    • Intermediate experience with cloud infrastructure automation and CI/CD pipeline design.
    • Expertise in operational monitoring and management tools (Sensu, Prometheus, Grafana, etc.).
    • Intermediate written & verbal communication skills, both highly technical and non-technical.
    • Ability to work closely with non-technical stakeholders and executives.
    • Systematic problem-solving approach coupled with a strong sense of ownership and drive.
    • RHCE Preferred.

Preferred:

    • Experience working with Object Storage systems at petabyte scale.
    • Experience using and managing one or more relational databases (e.g. MySQL).
    • Experience with non-relational databases (preferably Redis, Mongo).
    • Experience with cloud service providers (AWS, GCP, Azure, etc.).
    • Experience with Docker and container management systems (Swarm, Kubernetes, OpenShift, etc.).
