Location Details:

At GoDaddy the future of work looks different for each team. Some teams work in the office full-time, others have a hybrid arrangement (working remotely some days and in the office on others), and some work entirely remotely. This is a remote position, so you’ll be working from home. You may occasionally visit a GoDaddy office to meet with your team for events or meetings.

Join Our Team

GoDaddy is seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. This role centers on automating and maintaining our storage infrastructure, with a focus on Ceph, to ensure the reliability, scalability, and performance of our systems.

What you'll get to do...

* Automate and maintain day-to-day operations of storage systems to support application demands.
* Develop and maintain tools and automation scripts to streamline storage operations and improve efficiency.
* Monitor system performance, identify issues, and implement solutions to ensure high availability and reliability.
* Participate in agile practices such as daily stand-up meetings, task tracking boards, design and code reviews, automated testing, and continuous integration and deployment.
* Continuously improve system reliability, performance, and capacity through proactive monitoring, automation, and optimization.

Your experience should include...

* 2+ years of professional experience with Ceph in a production environment, including deployment, configuration, and management of Ceph clusters and systems.
* 2+ years of experience in site reliability engineering or a similar role.
* Experience working on Linux/Unix systems, with a focus on automation and operating at scale.
* Proficiency in Python or Bash.
* Experience with Ansible, Terraform, or SaltStack.
* Experience with Nagios-based monitoring tools, such as Icinga2.
* Experience with observability tooling, such as Prometheus, Grafana, Mimir, and Loki.
* Solid understanding of core networking concepts and protocols, particularly in relation to Linux/Unix systems.

You might also have...

* Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
* Exposure to and experience working with compute platforms (e.g., OpenStack, AWS).
* Familiarity with, and the ability to contribute to, CI/CD pipelines and automation workflows.