Cloud Site Reliability Staff Developer

10+ Years • DevOps • $122,000 PA - $162,000 PA

Job Summary

Job Description

Barracuda is seeking a passionate and experienced Site Reliability Staff Engineer (SRE) for its Managed Service Provider (MSP) and Managed XDR business units. This role focuses on ensuring the availability and seamless scaling of high-volume, critical SaaS applications. Responsibilities include designing application infrastructure, automating deployments, leading architectural decisions, developing self-service platforms, and managing incidents. The ideal candidate will have extensive experience in AWS, Kubernetes, CI/CD, and programming, contributing to a positive team culture and participating in on-call duties.
Must have:
  • Ensure availability and seamless scaling of high-volume, critical SaaS applications.
  • Engage with internal customers for application design and cloud infrastructure needs.
  • Create and design templates, tools, and accelerators for deployment infrastructure.
  • Lead architectural decisions and approve major system design changes.
  • Design and develop self-service platforms for Product Engineering teams.
  • Define, implement, and track SLIs, SLOs, and SLAs across services.
  • Lead incident response processes and conduct post-incident learning reviews.
  • Develop and maintain disaster recovery and business continuity plans.
  • Plan and implement non-functional requirements including security, performance, deployment frequency, and monitoring.
  • Oversee architecture snapshots, solution design, prototyping, and code reviews.
  • Drive modern solutions using AWS, Kubernetes, GitHub Actions, Jenkins, Terraform, Pulumi.
  • Build support infrastructure for global data pipeline and storage using Databricks, Spark, ELK.
  • Lead initiatives to convert manual deployments to automated processes.
  • Build and enhance monitoring and reliability systems.
  • Participate in on-call rotation to ensure 24/7 system reliability.
  • Mentor junior team members and foster a positive team culture.
  • 10+ years hands-on infrastructure design experience.
  • 5+ years cloud development experience.
  • 3+ years in SRE/DevOps roles.
  • Deep expertise in AWS cloud infrastructure development, security, and operations.
  • Extensive experience with Terraform, CloudFormation, Pulumi, Crossplane.
  • Strong background with GitHub, GitHub Actions, Jenkins, Packer, Ansible, Puppet.
  • Expertise in blue/green, canary, rolling deployments, and draining strategies.
  • Comprehensive experience with Docker, Kubernetes, and EKS in AWS environments.
  • Strong coding abilities in Python, Go, Ruby.
  • Advanced Linux knowledge including system internals.
  • Extensive experience with New Relic, Elastic APM, CloudWatch, Prometheus, Grafana.
  • Experience with Databricks, Apache Spark, Kafka, and DataStage.
  • Strong systematic debugging and troubleshooting capabilities.
  • Excellent verbal and written communication skills.
Good to have:
  • AWS certifications (Solutions Architect, DevOps)
  • Kubernetes certifications (CKA, CKAD, CKS)
Perks:
  • A team where you can voice your opinion, make an impact, and where you and your experience are valued.
  • Internal mobility with opportunities for cross training and career advancement.
  • Equity in the form of non-qualifying options.

Job Details

Description

Req ID: 26-124

Managed Service Provider (MSP) and Managed Extended Detection and Response (XDR)

Come join our passionate team! Barracuda is a leading cybersecurity company providing complete protection against complex threats. Our platform protects email, data, applications, and networks with innovative solutions, and a managed XDR service, to strengthen cyber resilience. Hundreds of thousands of IT professionals and managed service providers worldwide trust us to protect and support them with solutions that are easy to buy, deploy, and use.

We are committed to a candidate selection process and work environment that is inclusive and barrier free. To ensure candidates are assessed in a fair and equitable manner, accommodations will be provided to prospective employees in accordance with the Accessibility for Ontarians with Disabilities Act (AODA) and the Ontario Human Rights Code.

Envision yourself at Barracuda

We seek a passionate and experienced Site Reliability Staff Engineer (SRE) for the Managed Service Provider (MSP) and Managed XDR business units with great technical acumen and a strong background in operations, automation, implementation, and development.

As a Staff SRE, you will be responsible for ensuring the availability and seamless scaling of high-volume, critical SaaS applications. The application portfolio spans a broad spectrum of MSP and XDR products.

What will you be working on:

  • Application Infrastructure Design: Engage with internal customers to understand application design and cloud infrastructure needs, focusing on scalability, security, and reliability
  • Infrastructure Automation: Create and design templates, tools, and accelerators for deployment infrastructure to support development teams
  • Architectural Leadership: Lead architectural decisions and approve major system design changes, implementing contemporary architectural patterns
  • Platform Development: Design and develop self-service platforms for Product Engineering teams
  • Service Level Management: Define, implement, and track SLIs, SLOs, and SLAs across services
  • Incident Management: Lead incident response processes and conduct post-incident learning reviews
  • Disaster Recovery: Develop and maintain disaster recovery and business continuity plans
  • Technical Design: Plan and implement non-functional requirements including security, performance, deployment frequency, and monitoring
  • Solution Architecture: Oversee architecture snapshots, solution design, prototyping, and code reviews
  • Technology Stack Implementation: Drive modern solutions using AWS, Kubernetes, GitHub Actions, Jenkins, Terraform, Pulumi, and other current technologies
  • Data Infrastructure: Build support infrastructure for global data pipeline and storage using Databricks, Spark, and ELK stack
  • Deployment Automation: Lead initiatives to convert manual deployments to automated processes
  • Observability Systems: Build and enhance monitoring and reliability systems
  • On-Call Duties: Participate in on-call rotation to ensure 24/7 system reliability
  • Team Development: Mentor junior team members and foster a positive team culture

What you bring to the role:

  • Technical Expertise: 10+ years hands-on infrastructure design experience, including 5+ years cloud development and 3+ years in SRE/DevOps roles
  • Cloud Infrastructure: Deep expertise in AWS cloud infrastructure development, security, and operations with proven success in large-scale production environments
  • Infrastructure as Code: Extensive experience with Terraform, CloudFormation, Pulumi, and Crossplane for cloud infrastructure automation
  • CI/CD & Automation: Strong background with GitHub, GitHub Actions, Jenkins, Packer, Ansible, and Puppet
  • Deployment Patterns: Expertise in blue/green, canary, rolling deployments, and draining strategies
  • Container Orchestration: Comprehensive experience with Docker, Kubernetes, and EKS in AWS environments
  • Programming: Strong coding abilities in Python, Go, Ruby, etc.
  • Operating Systems: Advanced Linux knowledge including system internals
  • Observability: Extensive experience with New Relic, Elastic APM, CloudWatch, Prometheus, and Grafana...
  • Data Engineering: Experience with Databricks, Apache Spark, Kafka, and DataStage
  • Problem Solving: Strong systematic debugging and troubleshooting capabilities
  • Communication: Excellent verbal and written communication skills
  • Certifications: AWS certifications (Solutions Architect, DevOps) and Kubernetes certifications (CKA, CKAD, CKS) a plus

What you’ll get from us:

A team where you can voice your opinion, make an impact, and where you and your experience are valued. Internal mobility – there are opportunities for cross training and the ability to attain your next career step within Barracuda. In addition, you will receive equity, in the form of non-qualifying options.

The anticipated on-target earnings range for this role is CAD 122,000 to CAD 162,000. Actual compensation offered will be dependent upon the individual's skills, experience, and qualifications as they directly relate to the requirements of the position, the budget for the position, and applicable employment laws.

#LI-hybrid


