Summary by Outscal
Vimeo is looking for a Data Engineer III to improve the reliability of its data platforms and pipelines. This role requires 2+ years of experience working in Linux and cloud environments, along with experience in container orchestration platforms and distributed data stores. Strong coding skills in Java, Python, or Scala are also essential.
Data Engineer III
Our mission at Vimeo is to help businesses drive impact through video. Thanks to our strong community of video professionals and the volume of video consumed and uploaded on our platform, data is a crucial ingredient for success and one of our economic moats.
Vimeo supports about 200M registered video creators, billions of monthly video views, and hundreds of millions of monthly active users. We are looking for a Data Platform Engineer to help us improve the reliability of the data platforms and pipelines that serve billions of events and terabytes of data daily.
You’ll work closely with different data engineering teams on their incident management process: post-mortems, root cause analysis, and preventing incident recurrence.
If you are passionate about data reliability, scale, and automation, we should talk soon!
What You'll Do
- You will collaborate with engineering teams to improve, maintain, performance-tune, and capacity-plan Vimeo’s data platforms and infrastructure.
- Design business continuity and disaster recovery plans and processes, and work with engineering teams on implementation.
- You will drive the incident management process for our data platform, working with our partner teams to perform incident post-mortems and root cause analysis and to prevent recurring incidents.
- You will lead the standard change and release management process, automate and promote related best practices across engineering teams, and help Vimeo meet and maintain legal compliance.
- Build intelligent monitoring over data pipelines and infrastructure to achieve early, automated anomaly detection (see the sketch after this list).
- You'll work closely with software developers to build an end-to-end automated testing framework and system-level testing environment.
- Participate in an on-call rotation.
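To make the "intelligent monitoring" and "automated anomaly detection" bullet more concrete, here is a minimal Python sketch of a rolling-window z-score check over hourly pipeline event counts. The metric, window size, and threshold are illustrative assumptions, not a description of Vimeo's actual monitoring stack.

```python
# Minimal sketch: flag an hourly event count as anomalous when it deviates
# more than `z_threshold` standard deviations from a rolling window.
# The window size (24 hours) and threshold (3.0) are illustrative assumptions.
from collections import deque
from statistics import mean, stdev

def is_anomalous(history: deque, latest: float, z_threshold: float = 3.0) -> bool:
    if len(history) < 10:                 # too little history to judge
        return False
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold

# Hypothetical hourly event counts for one pipeline, ending in a sudden drop.
window = deque(maxlen=24)
for count in [1_000_000, 1_020_000, 990_000, 1_010_000, 1_005_000,
              998_000, 1_015_000, 1_002_000, 995_000, 1_008_000, 120_000]:
    if is_anomalous(window, count):
        print(f"ALERT: hourly event count {count} deviates from the recent trend")
    window.append(count)
```

In a real pipeline, a check like this would run against metrics pulled from a monitoring system and feed an alerting channel; the point of the role is to build and automate that kind of detection at Vimeo's scale.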
What To Bring
- You have production experience with distributed data stores, e.g. HBase, ZooKeeper, Kafka
- Own, manage, monitor, and optimize the reliability and overall health of our development and production environments
- Detailed problem-solving approach, coupled with a strong sense of ownership and drive
- A bias to action and a passion for delivering high-quality data solutions
- 2+ years of experience working in Linux environments, and proficiency with cloud environments (AWS, GCP)
- Experience with container orchestration platforms, particularly Kubernetes, for managing and deploying data processing and analysis applications.
- Experience coding in one or more of the following programming languages: Python, Java (mandatory), or Scala
- 2+ years of hands-on experience in Reliability Engineering for high-performance, scalable, and distributed data systems, with a focus on automation
- Experience in config management systems like Chef, Puppet, Ansible, or Terraform.
- Deep understanding of CI/CD principles and familiarity with source control systems (Git)
- Ability to work with peer SREs to roll out changes to our production environment and help mitigate data-related production incidents
- Experience with a Change Data Capture system, such as Debezium, is a plus (a minimal sketch of consuming CDC events follows this list)
- Attention to detail and quality with excellent problem-solving and interpersonal skills
- A bonus - you have some experience in data warehousing and data engineering
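As an illustration of the Change Data Capture bullet above, here is a minimal Python sketch that consumes Debezium-style change events from a Kafka topic. The topic name, broker address, and consumer group are placeholder assumptions; the envelope fields ("op", "before", "after") follow Debezium's standard event format.

```python
# Minimal sketch: read Debezium change events from Kafka and print the change type.
# Broker address, topic, and group id below are placeholders.
import json
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "dbserver1.inventory.customers",      # hypothetical Debezium topic
    bootstrap_servers="localhost:9092",
    group_id="cdc-audit",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")) if v else None,
)

for message in consumer:
    event = message.value
    if event is None:                      # tombstone record after a delete
        continue
    payload = event.get("payload", event)  # handles envelopes with or without schema
    op = payload.get("op")                 # c=create, u=update, d=delete, r=snapshot read
    if op == "d":
        print("row deleted:", payload.get("before"))
    else:
        print("row upserted:", payload.get("after"))
```

This sketches only the consumer side; in practice Debezium runs as a Kafka Connect source connector, and downstream jobs would apply these changes to a data store or warehouse rather than print them.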