Software Reliability Engineer (Observability)

13 Hours ago • All levels • Frontend Development • Undisclosed

About the job

Job Description

Canva is seeking a Software Reliability Engineer (Observability) to build and improve their observability platform and tooling. Responsibilities include providing technical leadership, optimizing the tracing platform, improving operational effectiveness, advocating for best practices, and improving the user experience. The ideal candidate will have strong coding skills in Python, Java, or Golang, deep knowledge of computer engineering, experience with AWS and Kubernetes, and familiarity with observability tools like Elasticsearch, Grafana, and Jaeger. The role involves working with a team to maintain tracing libraries and infrastructure, error reporting, and handling guidelines, ensuring the scalability and reliability of Canva's platform for developers.
Must have:
  • Proficient in Python, Java, or Golang
  • Deep knowledge of Computer Engineering
  • AWS (EC2, EKS, Lambda etc.) experience
  • Kubernetes experience
  • Observability tooling experience (Elasticsearch, Grafana, Jaeger)
  • Experience with highly available distributed systems
Good to have:
  • OpenTelemetry experience
  • Java or TypeScript application code experience
  • Experience building monitoring infrastructure at scale
  • Experience with ClickHouse
  • Experience with data security and PII detection
Perks:
  • Equity packages
  • Inclusive parental leave policy
  • Annual Vibe & Thrive allowance
  • Flexible leave options

Join the team redefining how the world experiences design.

Hey, g'day, mabuhay, kia ora, 你好, hallo, vítejte!

Thanks for stopping by. We know job hunting can be a little time consuming and you're probably keen to find out what's on offer, so we'll get straight to the point.

Where and how you can work

Our flagship campus is in Sydney. We also have a campus in Melbourne and co-working spaces in Brisbane, Perth and Adelaide. But you have a choice in where and how you work: we trust our Canvanauts to choose the balance that empowers them and their team to achieve their goals.

What you’d be doing in this role

As Canva scales, change continues to be part of our DNA. But we like to think that's all part of the fun. So this will give you a flavour of the type of things you'll be working on when you start, but this will likely evolve.

At the moment, this role is focused on:

  • Building and improving our observability platform and tooling, which are used by all Canva engineers.
  • Providing technical leadership and expertise to drive pragmatic solutions and dive into impactful design decisions.
  • Brainstorming, researching and prototyping to optimize our tracing platform, improve our operational effectiveness and increase reliability.
  • Being proactive in improving the tracing user experience and advocating for best practices.
  • Participating in team ceremonies, knowledge sharing and brainstorming sessions.
  • Becoming an observability champion, evangelising best practices and guiding other Canvanauts in the observability space.
  • Finding ways to improve the use of traces and provide better insights to our engineers.

You're probably a match if

  • You are proficient and happy to code in Python, Java or Golang.
  • You have deep knowledge and understanding of Computer Engineering fundamentals and first principles.
  • You have a solid knowledge of AWS (EC2, EKS, Lambda, SQS, Kinesis, S3) or equivalent.
  • You have experience deploying and running containerized workloads on a platform like Kubernetes.
  • You have experience with observability tooling, with competency in tools like Elasticsearch, Grafana, Sentry, Jaeger or similar.
  • You have experience running highly available and reliable distributed systems, with highly scalable data stores.
  • You are proficient with infrastructure-as-code - we’re a Terraform shop, but strong experience with other IaC tools will do the trick.

Not essential, but helpful experience!

  • You have experience with OpenTelemetry because it underpins a lot of the infrastructure and tooling that the team owns.
  • You have experience writing application code in Java or frontend code in TypeScript, since we also maintain the tracing libraries.
  • You have experience building and running monitoring infrastructure at scale, for example petabyte-scale Elasticsearch clusters or similar databases.
  • You have experience with data handling at scale.
  • You have experience with ClickHouse.
  • You have experience with data security, data obfuscation and PII detection.

About the team

You’ll join The Observability Traces & Exceptions Team, responsible for operational insights inside Canva. Our goal is to provide our development team with world-class tools to view how their services are performing in production. We achieve this by combining industry-leading third-party solutions with our own solutions developed in-house.

We work across the entire stack, maintaining our TypeScript and Java tracing libraries, our tracing infrastructure, error reporting libraries and error handling guidelines, to name just a few. As we scale all of these areas, we require more sophisticated solutions to ensure that Canva developers can continue to grow without compromising on reliability or availability.

What's in it for you?

Achieving our crazy big goals motivates us to work hard - and we do - but you'll experience lots of moments of magic, connectivity and fun woven throughout life at Canva, too. We also offer a range of benefits to set you up for every success in and outside of work.

Here's a taste of what's on offer:

  • Equity packages - we want our success to be yours too
  • Inclusive parental leave policy that supports all parents & carers
  • An annual Vibe & Thrive allowance to support your wellbeing, social connection, office setup & more
  • Flexible leave options that empower you to be a force for good and take time to recharge, and that support you personally

Check out lifeatcanva.com for more info.

Other stuff to know

We make hiring decisions based on your experience, skills and passion, as well as how you can enhance Canva and our culture. When you apply, please tell us the pronouns you use and any reasonable adjustments you may need during the interview process.

We celebrate all types of skills and backgrounds at Canva, so even if you don’t feel like your skills quite match what’s listed above, we still want to hear from you!

Please note that interviews are conducted virtually.
