Kubernetes Platform Engineer

42 Minutes ago • 2-4 Years

Job Summary

Job Description

As a Kubernetes Platform Engineer at TensorWave, you will maintain the stability and reliability of our bare-metal Kubernetes infrastructure. This role involves troubleshooting, incident response, and day-to-day cluster operations across multi-tenant workloads, supporting cutting-edge AI environments. You will work closely with senior engineers to deepen your Kubernetes expertise and contribute to the next generation of AI innovation.

Job Details

At TensorWave, we're leading the charge in AI compute, building a versatile cloud platform that's driving the next generation of AI innovation. We're focused on creating a foundation that empowers cutting-edge advancements in intelligent computing, pushing the boundaries of what's possible in the AI landscape.

About the Role:

As a Kubernetes Platform Engineer focused on support and operations, you’ll play a critical role in maintaining the stability and reliability of our bare-metal Kubernetes infrastructure. You will work closely with senior engineers, taking point on troubleshooting, incident response, and day-to-day cluster operations across multi-tenant workloads.

This is a great opportunity for engineers ready to deepen their Kubernetes expertise while supporting cutting-edge AI environments in real time.

Responsibilities:

  • Own and troubleshoot operational issues within Kubernetes environments
  • Maintain and monitor core services (e.g., Cilium, HAProxy, Prometheus)
  • Ensure uptime, performance, and reliability of multi-tenant clusters
  • Assist with ingress/egress connectivity and network debugging
  • Support internal and customer teams in secure, isolated VPC environments
  • Collaborate with senior engineers on automation and cluster lifecycle improvements

Required Skills & Experience:

  • 2–4 years of experience in DevOps, SRE, or Linux infrastructure roles
  • 1+ years of hands-on experience with Kubernetes in production
  • Familiarity with networking, CNI plugins, and core Linux troubleshooting
  • Strong infrastructure-as-code mindset, with experience using tools such as Helm, Terraform, or Ansible
  • Solid experience with monitoring and logging tools (e.g., Prometheus, Grafana, Loki)
  • Understanding of secure infrastructure design principles and least-privilege access
  • Comfortable working in a team-oriented, fast-paced operational environment

Nice to Have:

  • Experience with RKE2, Rancher, or similar platforms
  • Experience troubleshooting or supporting AI or GPU-based workloads
  • Familiarity with HAProxy, Cilium, or other Kubernetes ingress/networking tools

What We Bring:

In addition to a competitive salary, we offer a variety of benefits to support your needs, including:

  • Stock Options
  • 100% paid Medical, Dental, and Vision insurance
  • Life and Voluntary Supplemental Insurance
  • Short-Term Disability Insurance
  • Flexible Spending Account
  • 401(k)
  • Flexible PTO
  • Paid Holidays
  • Parental Leave
  • Mental Health Benefits through Spring Health

About The Company

Las Vegas, Nevada, United States (On-Site)
