THE CHALLENGE
Eventbrite's business continues to grow and scale rapidly, powering millions of events. Event creators and event goers need new tools and technologies that empower them to have the most meaningful live experiences. As a Site Reliability Engineer II, you will be part of a team that ensures that the Eventbrite platform runs efficiently, securely, and at scale.
THE TEAM
We're a people-focused Engineering organization: the people on our team value working together in small teams to solve significant problems, supporting an active culture of mentorship and inclusion, and pushing themselves to learn new things daily. Pair programming, weekly demos, tech talks, and quarterly hackathons are at the core of how we’ve built our team and product. We believe in engaging with the community, regularly hosting free events with some of the top technical speakers, and actively contributing to open source software (check out Britecharts as an example!). Our technology spans the web, mobile, API, big data, machine learning, search, physical point of sale, and scanning systems.
THE ROLE
As a Site Reliability Engineer II, you’ll work alongside the SRE team that supports the Eventbrite platform. You will embrace the SRE model and work with senior leaders on the team to modernize our tech stack. Typical activities will include:
- Oversee the smooth operation of Eventbrite’s platform from frontend to backend.
- Perform requirements and capacity analysis; design, launch, and extend microservices; monitor and troubleshoot complex systems and production issues; and identify root causes.
- Support the transition from self-hosted services to cloud-managed services.
- Identify and execute on opportunities to optimize existing systems, improve infrastructure, and eliminate work through automation in partnership with other teams throughout Eventbrite.
- Partner closely with the Security team to ensure that Eventbrite’s systems, services and data are protected.
- Automate infrastructure provisioning with tools like Terraform, CloudFormation, and Puppet.
- Automate manual tasks required for support of the Eventbrite platform.
- Work closely with AWS to containerize applications and provision infrastructure.
THE TECH STACK
Our primary stack is Python- and Django-based microservices, running on AWS with MySQL backends. Some of the other tools that we use heavily are Redis, RabbitMQ, Elasticsearch, Kafka, Git, and an endless supply of coffee.
THE SKILLSET
- 2+ years of relevant industry experience in SRE, Cloud Engineering, or DevOps roles
- Considerable experience with Linux systems administration (Ubuntu experience appreciated)
- Solid programming skills (Python preferred; Ruby or Go also welcome) and the ability to understand and apply computer science fundamentals: data structures, algorithms, and design patterns
- Experience with AWS and cloud architectures/services
- Familiarity with the container and container orchestration space (Docker, Kubernetes, etc.)
- Experience working with infrastructure provisioning tools like CloudFormation, Terraform, Chef, Puppet, or others
- Experience enabling CI/CD pipelines using tools such as Jenkins, CircleCI, GitLab, or others
- Ability to handle on-call duty with the team
- Track record of delivering successful solutions and collaborating with others
BONUS POINTS FOR
- Familiarity with the HashiCorp suite (Consul, Nomad, and Vault) for service discovery, orchestration, and secrets management
- Active Eventbrite user with a passion for live events