Senior Cloud Site Reliability Engineer

5+ Years
DevOps

Job Description

Barracuda is seeking a passionate and experienced Senior Cloud Site Reliability Engineer for its Email Protection business unit. This role focuses on ensuring the availability and seamless scaling of high-volume, critical SaaS applications. Responsibilities include supporting application infrastructure, implementing automation for deployments, maintaining self-service platforms, managing service levels, participating in incident response, and contributing to disaster recovery plans. The ideal candidate will have strong technical acumen in operations, automation, and development, working with AWS, Kubernetes, and CI/CD tools to strengthen cyber resilience.

Description

Req ID: 26-321

Come join our passionate team! Barracuda is a leading cybersecurity company providing complete protection against complex threats. Our platform protects email, data, applications, and networks with innovative solutions and a managed XDR service to strengthen cyber resilience. Hundreds of thousands of IT professionals and managed service providers worldwide trust us to protect and support them with solutions that are easy to buy, deploy, and use.

We are committed to a candidate selection process and work environment that is inclusive and barrier-free. To ensure candidates are assessed in a fair and equitable manner, accommodations will be provided to prospective employees in accordance with the Accessibility for Ontarians with Disabilities Act (AODA) and the Ontario Human Rights Code.

Envision yourself at Barracuda

We seek a passionate and experienced Senior Cloud Site Reliability Engineer (SRE) for the Email Protection business unit with great technical acumen and a strong background in operations, automation, implementation, and development. You will be responsible for ensuring the availability and seamless scaling of high-volume, critical SaaS applications. The application portfolio spans a broad spectrum of Email Protection products.

What you will be working on:

  • Application Infrastructure Support: Work with internal customers to understand application design and cloud infrastructure requirements, focusing on scalability and reliability
  • Infrastructure Automation: Implement templates, tools, and scripts for infrastructure deployment to support development teams
  • Platform Support: Help develop and maintain self-service platforms for the Product Engineering team
  • Service Level Management: Implement and monitor SLIs, SLOs, and SLAs across services
  • Incident Management: Participate in incident response processes and contribute to post-incident reviews
  • Disaster Recovery: Help maintain disaster recovery and business continuity plans
  • Technical Implementation: Implement non-functional requirements including security, performance, and monitoring
  • Solution Implementation: Assist with architecture implementation, solution design, and code reviews
  • Technology Stack Implementation: Implement solutions using AWS, Kubernetes, GitHub Actions, Jenkins, Terraform, and other current technologies
  • Deployment Automation: Support initiatives to convert manual deployments to automated processes
  • Observability Systems: Maintain and enhance monitoring and reliability systems
  • On-Call Duties: Participate in the on-call rotation to ensure 24/7 system reliability

What you bring to the role:

  • Technical Expertise: 5+ years of hands-on infrastructure experience, including 3+ years in cloud development and SRE/DevOps roles
  • Cloud Infrastructure: Strong knowledge of AWS cloud infrastructure, security, and operations in production environments
  • Infrastructure as Code: Experience with Terraform, CloudFormation, or Pulumi for cloud infrastructure automation
  • CI/CD & Automation: Experience with GitHub, GitHub Actions, Jenkins, and configuration management tools
  • Deployment Patterns: Knowledge of blue/green, canary, and rolling deployment strategies
  • Container Orchestration: Experience with Docker, Kubernetes, and EKS in AWS environments
  • Programming: Solid coding abilities in Python, Go, or similar languages
  • Operating Systems: Strong Linux knowledge, including system administration
  • Observability: Experience with monitoring tools like New Relic, CloudWatch, Prometheus, and Grafana
  • Problem Solving: Good debugging and troubleshooting capabilities
  • Certifications: AWS certifications (Solutions Architect, SysOps) or Kubernetes certifications (CKA, CKAD) a plus

What you’ll get from us:

A team where you can voice your opinion, make an impact, and where you and your experience are valued. Internal mobility – there are opportunities for cross training and the ability to attain your next career step within Barracuda. In addition, you will receive equity, in the form of non-qualifying options.

#LI-remote
