Service Reliability Engineer - ASE Data Platform

5-8 Years
DevOps

Job Description

The Apple Services Engineering team is seeking a Service Reliability Engineer for its Data Platform. This role involves providing a robust cloud platform for mission-critical systems, ensuring constant uptime, seamless scalability, and supporting new applications. The SRE will operate, monitor, and triage production environments, automate deployments, participate in capacity planning, and troubleshoot customer concerns. They will also collaborate with developers to improve system stability, security, and scalability.
Good To Have:
  • Experience working on multiple cloud environments like AWS and GCP.
  • Experience partnering with development teams.
  • Strong verbal communication.
Must Have:
  • Operate, monitor, and triage production and non-production environments.
  • Automate deployment and orchestration of services into cloud environments.
  • Actively participate in capacity planning, scale testing, and disaster recovery exercises.
  • Troubleshoot customer concerns for ML Tuning and inference endpoints on Ray.
  • Design and implement RESTful/RPC APIs and services using Golang or Python.
  • Implement SLO/SLI and error budget reporting for various customers.
  • Deep expertise in Ray.
  • 2+ years of experience with Golang and Python.
  • Experience designing and implementing RESTful/RPC APIs and services.
  • Experience working with Kubernetes and AWS.

The Apple Services Engineering team (ASE) is one of the most exciting examples of Apple’s long-held passion for combining art and technology. These are the people who power the App Store, Apple TV, Apple Music, Apple Podcasts, and Apple Books. And they do it at an extensive scale, meeting our high expectations with dedication to deliver a huge variety of entertainment in over 35 languages to more than 150 countries. These engineers build secure, end-to-end solutions. They develop the custom software used to process all the creative work, the tools that providers use to deliver that media, all the server-side systems, and the APIs for many Apple services. Thanks to Apple’s unique integration of hardware, software, and services, engineers here partner to get behind a single unified vision. That vision always includes a deep commitment to strengthening Apple’s privacy policy, one of our core values. Although services are a bigger part of Apple’s business than ever before, these teams remain small and multi-functional, offering greater exposure to the array of opportunities here.

As a Service Reliability Engineer, you will be responsible for providing the platform that lets mission-critical cloud systems maintain constant uptime, scale seamlessly, and allow new applications and services to flourish. The successful candidate will be highly self-motivated, with a passion for excellence, quality, and detail. The SRE will not only support operations but also work closely with the developers and architects on the team, aiding in design and assisting with implementation to improve stability, security, and scalability.

  • Operate, monitor, and triage all aspects of our production and non-production environments.
  • Automate deployment and orchestration of services into the cloud environment as well as other routine processes.
  • Work on multiple cloud environments such as AWS and GCP.
  • Actively participate in capacity planning, scale testing, and disaster recovery exercises.
  • Interact with and support partner teams, including Engineering, QA, and program management.
  • Troubleshoot customer concerns for ML Tuning and inference endpoints on Ray.
  • Design and implement RESTful/RPC APIs and services using Golang or Python.
  • Implement SLO/SLI and error budget reporting for various customers.

Key Qualifications

  • Bachelor's Degree in Computer Science, an engineering-related field, or equivalent related experience.
  • 5 to 8 years of experience in a Service Reliability Engineering, DevOps, or infrastructure-focused role.
  • Deep expertise in Ray.
  • 2+ years of experience with Golang and Python.
  • Experience designing and implementing RESTful/RPC APIs and services.
  • Experience working with Kubernetes and AWS.

Additional Requirements

  • Experience working on multiple cloud environments such as AWS and GCP.
  • Experience partnering with development teams.
  • Strong verbal communication skills.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.
