About the team:
We are seeking an experienced Site Reliability Engineer (SRE) to join our dynamic Cloud Infrastructure team.
Our platform empowers millions of buyers and sellers in more than 190 markets around the world. We’re committed to pushing boundaries and leaving our mark as we reinvent the future of ecommerce for enthusiasts.
The Cloud SRE team is pivotal to ensuring reliable, secure, and efficient cloud infrastructure. We are responsible for implementing robust architecture designs, redundancy, and comprehensive monitoring and observability solutions; streamlining processes and automating routine tasks to enhance productivity; ensuring consistency and adherence to security and regulatory standards; and fostering a culture of ongoing enhancement.
This role demands a deep understanding of cloud-native technologies, particularly containers and Kubernetes, along with strong programming skills in languages such as Go and Python. The ideal candidate will have a proven track record of at least 3 years in the field, focusing on the design, development, deployment, and operation of self-service platforms that facilitate the lifecycle management of applications supporting eBay's products and services.
What you will accomplish:
- Collaborate with internal customers and partners to deliver key business outcomes.
- Ensure that cloud products are reliable, efficient, and compliant with eBay's security and operational standards.
- Enhance observability practices to ensure comprehensive monitoring and alerting across cloud services.
- Drive improvements in CI/CD processes to increase deployment velocity and reliability.
- Design, implement, and manage modern cloud-native products and services that provide competitive advantages for eBay.
- Lead initiatives to adapt and integrate the latest open-source tools and technologies within eBay's infrastructure.
What you will bring:
- Minimum of 3 years of programming experience with Go and Python.
- Proven experience implementing large-scale, distributed, high-availability, fault-tolerant systems and infrastructure in a production environment.
- Proficiency in delivering products within a multi-functional team environment.
- Demonstrated expertise in observability tools and practices, ensuring system reliability and performance.
- Extensive experience as an SRE working with Kubernetes or related cloud infrastructure and cloud-native technologies. Experience developing with Kubernetes and/or building Kubernetes controllers is highly desirable.
- Deep understanding of API design and RESTful principles, with experience in building web services at scale.
- Experience participating in open-source standards efforts and contributing to open-source projects is a plus.