Site Reliability Engineer

49 Minutes ago • 5 Years +

DevOps

Job Description

We are looking for an experienced Site Reliability Engineer to lead infrastructure automation, CI/CD workflows, and deployment operations for a custom web platform. The role involves working with a modern DevOps stack including GitHub Actions, GCP, Kubernetes, Terraform, PostgreSQL, CodeDeploy, and Cloudflare. This hands-on position focuses on enhancing developer productivity, optimizing deployment pipelines, and ensuring system reliability and scalability.

Good To Have:

Experience supporting high-traffic or distributed systems.
Knowledge of container orchestration.
Familiarity with secrets management tools (e.g., AWS Secrets Manager, Vault).
Exposure to infrastructure cost optimization and budgeting practices.

Must Have:

Design, implement, and maintain CI/CD pipelines using GitHub Actions.
Manage and optimize GCP infrastructure.
Build robust deployment workflows leveraging CodeDeploy, containers, and automated rollback strategies.
Configure and manage Cloudflare settings including DNS, security rules, and CDN optimizations.
Support PostgreSQL database operations.
Implement monitoring, logging, and alerting systems to proactively address stability and performance issues.
Automate manual processes to improve efficiency and reduce operational risk.
Contribute to disaster recovery planning, high availability design, and system hardening efforts.
Collaborate cross-functionally with engineering teams to enforce DevOps best practices.
5+ years of experience as a Site Reliability Engineer / DevOps Engineer with CI/CD systems for web platforms.
Strong hands-on experience with GitHub Actions, Kubernetes, and GCP.
Proven ability to automate infrastructure, build repeatable pipelines, and manage cloud environments at scale.
Experience with CodeDeploy, PostgreSQL, and infrastructure monitoring tools.
Familiarity with Cloudflare configuration and performance/security features.
Solid scripting skills (Bash, Python) for automation and tool development.
Strong understanding of system reliability, scalability techniques, and security best practices.

Perks:

Work at the cutting edge of AI and web technology.
Build real-world, user-facing AI experiences.
Collaborate with a world-class team of AI, product, and platform engineers.
Enjoy a flexible, creative, and fast-paced environment with lots of ownership.
Benefits may vary by location due to regional regulations and company policies.

Description

-----------

We’re seeking an experienced Site Reliability Engineer to join our team and lead infrastructure automation, CI/CD workflows, and deployment operations for a custom web platform. You’ll be working with a modern DevOps stack including GitHub Actions, GCP, Kubernetes, Terraform, PostgreSQL, CodeDeploy, and Cloudflare to ensure our platform is robust, scalable, and ready for growth.

This is a hands-on role focused on enabling developer productivity, improving deployment pipelines, and ensuring system reliability at scale.

Key Responsibilities:

Design, implement, and maintain CI/CD pipelines using GitHub Actions and related tools
Manage and optimize GCP infrastructure
Build robust deployment workflows leveraging CodeDeploy, containers, and automated rollback strategies
Configure and manage Cloudflare settings including DNS, security rules, and CDN optimizations
Support PostgreSQL database operations in collaboration with platform teams
Implement monitoring, logging, and alerting systems to proactively address stability and performance issues
Automate manual processes to improve efficiency and reduce operational risk
Contribute to disaster recovery planning, high availability design, and system hardening efforts
Collaborate cross-functionally with engineering teams to enforce DevOps best practices

Requirements

------------

5+ years of experience as a Site Reliability Engineer / DevOps Engineer working with CI/CD systems for medium to large-scale web platforms
Professional experience with GCP - Google Cloud Platform
Strong hands-on experience with GitHub Actions, Kubernetes, and the wider GCP ecosystem
Proven ability to automate infrastructure, build repeatable pipelines, and manage cloud environments at scale
Experience with CodeDeploy, PostgreSQL, and infrastructure monitoring tools (e.g., Prometheus, Datadog, Grafana, CloudWatch)
Familiarity with Cloudflare configuration and performance/security features
Solid scripting skills (Bash, Python, etc.) for automation and tool development
Strong understanding of system reliability, scalability techniques, and security best practices

Nice to Have:

Experience supporting high-traffic or distributed systems
Knowledge of container orchestration
Familiarity with secrets management tools (e.g., AWS Secrets Manager, Vault)
Exposure to infrastructure cost optimization and budgeting practices

About Pixomondo and our Innovation Lab Team

PXO, a Sony Pictures Entertainment company, creates industry-leading Visualization, Virtual Production, and Visual Effects for premium Film and Episodic content. Through its 23-year history, the Oscar, BAFTA, & Emmy-winning creative and technology company has been a trusted partner for storytellers and showrunners worldwide.

PXO’s Innovation Lab is where the future of content creation is being built. Backed by Sony and powered by a world-class team of disruptors, this high-tech hub explores emerging technologies like AI, machine learning, real-time engines, robotics, and new media workflows.

We’re not just redefining how VFX and animation are made; we’re shaping the next wave of storytelling across all mediums. Our team thrives on experimentation, rapid prototyping, and pushing the boundaries of what’s possible, using cutting-edge hardware and software to challenge industry norms and invent bold new ways to create.

Why Join Us?

Work at the cutting edge of AI and web technology.
Build real-world, user-facing AI experiences—not just chatbots.
Collaborate with a world-class team of AI, product, and platform engineers.
Enjoy a flexible, creative, and fast-paced environment with lots of ownership.

Benefits

--------

Individual salaries within this range will be dependent upon skills, experience, and qualifications. Benefits may vary by location due to regional regulations and company policies.

Pixomondo is an equal opportunity employer. We evaluate qualified applicants without regard to race, color, religion, sex, national origin, disability, veteran status, age, sexual orientation, gender identity, or other protected characteristics.

PXO does not accept resumes from recruiters. Unsolicited resumes are accepted directly from candidates only. PXO will not pay any fees associated with unsolicited resumes.
