Senior Site Reliability Engineer

2 Hours ago • All levels

Job Summary

Job Description

Snyk is looking for a Senior Site Reliability Engineer to join its team. The role involves designing, deploying, and maintaining infrastructure on AWS, managing Kubernetes clusters, and using tools such as ArgoCD, Prometheus, and Terraform. The SRE will also implement monitoring and alerting, automate infrastructure provisioning, apply networking best practices, troubleshoot complex system issues, ensure high availability, and collaborate with development teams.
Must have:
  • Experience with AWS services (VPC, EC2, EKS, RDS, IAM).
  • Deep understanding of Kubernetes architecture and cluster management.
  • Experience with Cloudflare products (DNS, Zero Trust, WAF, CDN).
  • Proficiency in Prometheus + Grafana monitoring stack.
  • Experience with Calico for managing Kubernetes network policies.
  • Solid experience with Graylog and OpenSearch for logging.
  • Proficient with Infrastructure as Code tools, especially Terraform.
  • Experience with CI/CD pipelines and GitOps using ArgoCD.
  • Strong scripting and automation skills in Bash and/or Python.
  • Solid knowledge of networking principles (TCP/IP, DNS, HTTP/HTTPS).
Good to have:
  • Familiarity with incident management practices.
  • Understanding of Zero Trust security models.
  • Exposure to Service Mesh (Istio, Linkerd).
  • Experience with cost optimization and cloud spend monitoring.
  • Familiarity with Linux system administration.
  • Knowledge of RBAC and IAM in AWS and Kubernetes.
Perks:
  • Flexible working hours, work-from-home allowances, in-office perks, and time off for learning and self development
  • Generous vacation and wellness time off, country-specific holidays, and 100% paid parental leave for all caregivers
  • Health benefits, employee assistance plans, and annual wellness allowance
  • Country-specific life insurance, disability benefits, and retirement/pension programs, plus mobile phone and education allowances

Job Details

Every day, the world gets more digital thanks to tens of millions of developers building the future faster than ever. But with exponential growth comes exponential risk, as outnumbered security teams struggle to secure mountains of code. This is where Snyk (pronounced “sneak”) comes in. Snyk is a developer security platform that makes it easy for development teams to find, prioritize, and fix security vulnerabilities in code, dependencies, containers, and cloud infrastructure, and do it all right from the start. Snyk is on a mission to make the world a more secure place by empowering developers to develop fast and stay secure.

Joining Snyk means embracing our core values: One Team, Care Deeply, Customer Centric, and Forward Thinking. As a member of our team, you’ll have the opportunity to thrive in a dynamic environment where fostering collaboration, leading with empathy, driving business impact, and inspiring trust are at the heart of everything we do.

Our Opportunity

Snyk, a leader in developer security, has acquired Probely, a modern Dynamic Application Security Testing (DAST) provider based in Portugal, with coverage of API security testing and web applications.

We are seeking a skilled and proactive Site Reliability Engineer (SRE) to join our team and support our hypergrowth by building scalable, reliable, and secure cloud infrastructure. You will be responsible for ensuring the performance and uptime of our systems while adopting DevOps best practices and leveraging modern tools.

You’ll Spend Your Time:

  • Designing, deploying, and maintaining infrastructure on AWS, including VPC, EC2, RDS, IAM, and EKS clusters.
  • Managing Kubernetes clusters across multiple environments with a focus on performance, security, and availability.
  • Using ArgoCD, Kustomize, and Helm for continuous deployment and GitOps workflows.
  • Implementing and managing monitoring and alerting systems using Prometheus, Grafana, and custom exporters.
  • Maintaining centralized logging and observability using Graylog and OpenSearch.
  • Automating infrastructure provisioning with Terraform and custom scripting in Python or Bash.
  • Implementing best practices around networking, including VPN, load balancing, routing, and firewalls.
  • Troubleshooting complex system issues across network, infrastructure, and application layers.
  • Ensuring high availability, scalability, and disaster recovery across all systems.
  • Collaborating with development and operations teams to improve deployment processes and infrastructure resiliency.

What You’ll Need:

  • Strong hands-on experience with AWS services (VPC, EC2, EKS, RDS, IAM).
  • Deep understanding of Kubernetes architecture and day-to-day cluster management.
  • Experience with Cloudflare products (DNS, Zero Trust, WAF, CDN).
  • Proficiency in the Prometheus + Grafana monitoring stack.
  • Strong experience with Calico for managing Kubernetes network policies.
  • Solid experience with Graylog and OpenSearch for logging and search analytics.
  • Proficient with Infrastructure as Code tools, especially Terraform, Kustomize and Helm.
  • Experience with CI/CD pipelines and GitOps practices using ArgoCD.
  • Strong scripting and automation skills in Bash and/or Python.
  • Solid knowledge of networking principles (TCP/IP, DNS, HTTP/HTTPS, VPNs, security groups, etc.).

We’d be Lucky if You Have:

  • Familiarity with incident management practices (on-call, runbooks, postmortems, disaster recovery).
  • Understanding of Zero Trust security models and security best practices in cloud environments.
  • Exposure to Service Mesh (Istio, Linkerd) and container networking.
  • Experience with cost optimization and cloud spend monitoring.
  • Familiarity with Linux system administration and shell scripting.
  • Knowledge of RBAC and IAM in AWS and Kubernetes.

We care deeply about the warm, inclusive environment we’ve created and we value diversity – we welcome applications from those typically underrepresented in tech. If you like the sound of this role but are not totally sure whether you’re the right person, do apply anyway!

About Snyk

Snyk is committed to creating an inclusive and engaging environment where our employees can thrive as we rally behind our common mission to make the digital world a safer place. From Snyk employee resource groups, to global benefits that help our employees prioritize their health, wellness, financial security, and a work/life blend, we aim to support our employees along their entire journeys here at Snyk.

Benefits & Programs

Prioritize health, wellness, financial security, and life balance with programs tailored to your location and role.

  • Flexible working hours, work-from-home allowances, in-office perks, and time off for learning and self development
  • Generous vacation and wellness time off, country-specific holidays, and 100% paid parental leave for all caregivers
  • Health benefits, employee assistance plans, and annual wellness allowance
  • Country-specific life insurance, disability benefits, and retirement/pension programs, plus mobile phone and education allowances
