Kubernetes Platform Engineer

1 Month ago • 2-4 Years
DevOps

Job Description

As a Kubernetes Platform Engineer at TensorWave, you will maintain the stability and reliability of our bare-metal Kubernetes infrastructure. This role involves troubleshooting, incident response, and day-to-day cluster operations across multi-tenant workloads, supporting cutting-edge AI environments. You will work closely with senior engineers to deepen your Kubernetes expertise and contribute to the next generation of AI innovation.

At TensorWave, we're leading the charge in AI compute, building a versatile cloud platform that's driving the next generation of AI innovation. We're focused on creating a foundation that empowers cutting-edge advancements in intelligent computing, pushing the boundaries of what's possible in the AI landscape.

About the Role:

As a Kubernetes Platform Engineer focused on support and operations, you’ll play a critical role in maintaining the stability and reliability of our bare-metal Kubernetes infrastructure. You will work closely with senior engineers, taking point on troubleshooting, incident response, and day-to-day cluster operations across multi-tenant workloads.

This is a great opportunity for engineers ready to deepen their Kubernetes expertise while supporting cutting-edge AI environments in real time.

Responsibilities:

  • Own and troubleshoot operational issues within Kubernetes environments
  • Maintain and monitor core services (e.g., Cilium, HAProxy, Prometheus)
  • Ensure uptime, performance, and reliability of multi-tenant clusters
  • Assist with Ingress/Egress connectivity and network debugging
  • Support internal and customer teams in secure, isolated VPC environments
  • Collaborate with senior engineers on automation and cluster lifecycle improvements

Required Skills & Experience:

  • 2–4 years of experience in DevOps, SRE, or Linux infrastructure roles
  • 1+ years of hands-on experience with Kubernetes in production
  • Familiarity with networking, CNI plugins, and core Linux troubleshooting
  • Strong infrastructure-as-code mindset using tools like Helm, Terraform, or Ansible
  • Solid experience with monitoring and logging tools (e.g., Prometheus, Grafana, Loki)
  • Understanding of secure infrastructure design principles and least-privilege access
  • Comfortable working in a team-oriented, fast-paced operational environment

Nice to Have:

  • Experience with RKE2, Rancher, or similar platforms
  • Experience troubleshooting or supporting AI or GPU-based workloads
  • Familiarity with HAProxy, Cilium, or other Kubernetes ingress/networking tools

What We Bring:

In addition to a competitive salary, we offer a variety of benefits to support your needs, including:

  • Stock Options
  • 100% paid Medical, Dental, and Vision insurance
  • Life and Voluntary Supplemental Insurance
  • Short-Term Disability Insurance
  • Flexible Spending Account
  • 401(k)
  • Flexible PTO
  • Paid Holidays
  • Parental Leave
  • Mental Health Benefits through Spring Health
