Site Reliability Engineer (SRE)

Posted 41 minutes ago • 2+ years of experience
DevOps

Job Description

Join Unikraft to reshape cloud infrastructure. This role involves maintaining and operating customer deployments, planning software updates, collaborating with engineering, managing monitoring systems, and automating deployment and CI/CD workflows. You will work on cutting-edge technology with OS veterans, focusing on extreme efficiency and scalability in a fully remote, deeply technical environment.

Join a world-class team reshaping cloud infrastructure from the ground up. Work on cutting-edge tech, learn from OS veterans, operate at scale—fully remote, deeply technical, and zero bureaucracy.

Join Unikraft!

We usually respond within a day

The cloud is broken: it's wasteful, slow, awfully expensive, and burdened with legacy tech that wasn't built for today's workloads. At Unikraft we're building a generational, truly millisecond-native, extremely scalable cloud platform that provides exponentially higher efficiency. Are you bored with your current job? Do you want to push the boundaries of what's possible in the cloud to the absolute limit?

Our team consists of some of the best systems, performance, and security geeks out there, and is backed by top investors, with category leaders as our customers. We believe a focused team of exceptional people, moving fast with conviction, can rebuild the cloud from first principles and make extreme efficiency (e.g., millions of users on a few servers) available to everyone.

What You’ll Do

  • Maintain and operate customer on-prem and cloud deployments of our platform, ensuring reliability and rapid troubleshooting of technical issues.
  • Plan, package, and roll out software updates both internally and to customers, including testing and validation.
  • Collaborate with engineering to ensure quality deployments and maintain a high standard of product reliability.
  • Set up and manage monitoring systems to proactively detect and resolve issues in production environments.
  • Write scripts and automation for deployment, infrastructure management, and CI/CD workflows.
  • Deploy, manage, and troubleshoot Kubernetes clusters for reliable, scalable infrastructure.
  • Build tooling and automation to streamline deployment and platform integration.
  • Contribute to continuous integration pipelines that catch regressions across components and system integrations.
  • Create and maintain clear documentation for systems, processes, and tools to support team effectiveness.

What We’re Looking For

  • At least 2 years of experience working in high-pressure environments.
  • Proven experience in Linux system administration, software packaging, and delivery.
  • Solid understanding of Linux networking fundamentals, including firewalls, DNS, proxies, and best practices.
  • Experience managing and troubleshooting Kubernetes clusters in production.
  • Good understanding of the CNCF/cloud-native landscape and associated tools.
  • Familiarity with observability tools such as Prometheus and Grafana.
  • Basic scripting skills (e.g., Bash, Python).
  • Familiarity with cloud platforms (e.g., AWS, GCP, Azure).
  • Interest in automation tools like Ansible, Terraform, or similar.
  • Exposure to CI/CD pipelines (e.g., GitHub Actions, Jenkins, GitLab CI).
  • Familiarity with microservice architectures, serverless, and DevOps best practices.
  • Familiarity with virtualization solutions like QEMU/KVM. Experience with micro-VMMs such as Cloud Hypervisor or Firecracker is a plus.

Mindset

  • Eagerness to learn and take on new challenges.
  • Strong problem-solving skills and a curious, analytical mindset.
  • Enthusiasm for building reliable, high-performance systems.
  • Team player with good communication skills.
  • Ability to quickly adapt to new programming languages, runtimes, and environments.

Why This Role is Career-Defining

  • Help revolutionize the future of cloud compute runtime while embracing continuously evolving modern technologies.
  • Work alongside a high-energy, top-notch, technical, and entrepreneurial team.
  • Make impactful contributions and help shape our rapidly growing company.
  • Gain deep hands-on experience with infrastructure and modern DevOps practices while learning from experienced engineers.

Why You’ll Love This Team

World-class Engineering: Collaborate with OS veterans, kernel hackers, and distributed systems experts.

No Crud: Founder-led, product-obsessed, and deeply technical. The best tech argument wins!

Groundbreaking Technology: Our tech powers the future of cloud infrastructure – come build it.

Build Your Favorite Work Setup: A generous equipment budget to spend on anything you need to do your best work.

Fully Remote, Fully Flexible: Work from your favorite place, at your most productive times.

Retreats, Game Nights and More: Fun-focused team retreats and other events to recharge and build great relationships.

The Standard Stuff: Competitive salary, 6 weeks of vacation, and development opportunities.
