Sr. Site Reliability Engineer

5+ Years • DevOps • $150,000 PA - $207,000 PA

Job Summary

Job Description

Join Vimeo's Site Reliability & Infrastructure Engineering team to design, develop, deploy, maintain, and optimize the platform powering Vimeo, a critical internet infrastructure application. This role involves working with cloud infrastructure at scale, optimizing performance, troubleshooting outages, building developer toolkits, and managing large distributed databases. SRE at Vimeo encompasses platform engineering, database administration, release engineering, and internal tools.
Must have:
  • Build, secure, and evolve platforms for Vimeo applications.
  • Build and maintain tooling for infrastructure automation.
  • Improve observability and reliability to minimize outages.
  • Write and maintain thorough documentation.
  • Contribute to internal self-service infrastructure platform.
  • Participate in weekly on-call rotation.
  • 5+ years in software development or DevOps.
  • High proficiency in at least one general-purpose programming language (C/C++, Go, Java, Ruby, PHP, Python, etc.).
  • Deep understanding of high-scalability distributed systems.
  • Expert-level proficiency with Kubernetes deployments.
  • Strong knowledge of container orchestration, Linux, networking.
  • Significant experience with Google Cloud, AWS.
  • Significant experience with MySQL administration.
  • Experience with Infrastructure as Code (Terraform).
  • Experience with observability systems (Datadog, Grafana, Prometheus, VictoriaMetrics, OpenCensus, Graphite).
Good to have:
  • Knowledge of ArgoCD, Atlantis, Varnish, Memcached, and/or Chef.
  • Experience with generalized build systems (make, bazel, please, etc.).
  • Experience with language-specific build systems (SWC, Turborepo, etc.).
Perks:
  • Variable compensation
  • Restricted Stock Units (RSUs)
  • Paid time off
  • Generous 401k match
  • Commuter benefits
  • Health Savings Account (HSA)
  • Flexible Spending Account (FSA)
  • Fertility reimbursement
  • Group term life insurance
  • Wellbeing resources

Job Details

Do you love working with cloud infrastructure at scale? Squeezing the last bit of performance and efficiency out of applications that get hundreds of thousands of requests per second? Digging deep to determine the root cause of an outage? Building and maintaining infrastructure toolkits that hundreds of developers will use? Maintaining large distributed databases?

Come work on the Site Reliability & Infrastructure Engineering team at Vimeo! Your job will be to design, develop, deploy, maintain, and optimize the platform that powers an application that is part of the infrastructure of the Internet: Vimeo. SRE at Vimeo spans the domains of platform engineering, database administration, release engineering, and internal tools.

What you’ll do:

  • Build, secure, and evolve platforms which power the applications that make up Vimeo
  • Build and maintain tooling which makes manual infrastructure work obsolete and enables self service for engineers
  • Improve observability and reliability of applications to reduce outages to an absolute minimum, while reducing MTTA and MTTR
  • Write and maintain thorough documentation to share with your teammates around the world, allowing them all to function as a cohesive unit
  • Contribute to an internal self-service infrastructure platform used by all engineers for application development and deployment.
  • Participate in a weekly on-call rotation shared between offices in the US and India, which includes responding to production incidents and providing internal support to other engineers at Vimeo
  • Do whatever it takes (within reason) to make Vimeo faster, simpler, more scalable, more reliable, and more efficient to operate

Skills and knowledge you should possess:

  • At least 5 years of professional experience in software development or DevOps, with high proficiency in at least one general-purpose programming language (C/C++, Go, Java, Ruby, PHP, Python, etc.)
  • Deep understanding of the architectural patterns of high-scalability distributed systems
  • Expert-level proficiency maintaining, optimizing, and administering Kubernetes deployments
  • Strong knowledge of container orchestration, Linux system internals, networking, and secure computing.
  • Significant experience with major cloud providers (Google Cloud, AWS)
  • Significant experience deploying and administering MySQL
  • Experience with “Infrastructure as Code” platforms such as Terraform
  • Experience with observability systems, such as Datadog, Grafana, Prometheus, VictoriaMetrics, OpenCensus, and Graphite

Bonus points (nice skills to have, but not needed):

  • Knowledge of ArgoCD, Atlantis, Varnish, Memcached, and/or Chef
  • Experience with generalized build systems (make, bazel, please, etc.) or language-specific build systems (SWC, Turborepo, etc.).
