Staff Cloud Infrastructure Engineer

Stem, Inc.

Job Description

Role Overview

Stem is seeking a Staff Cloud Infrastructure Engineer to design, build, and operate highly reliable, real-time backend systems that power critical energy infrastructure. This role focuses on large-scale distributed systems, cloud-native infrastructure, and operational excellence in always-on, mission-critical environments.

Key Responsibilities

  • Design, build, and operate highly available cloud infrastructure supporting real-time, mission-critical workloads
  • Own and enhance distributed systems that power data ingestion, streaming, analytics, and control pipelines
  • Support real-time data streaming, alerting, and analytics platforms used for grid-scale energy operations
  • Design and operate systems that integrate with SCADA and industrial control protocols (e.g., Modbus, DNP3)
  • Build and maintain comprehensive observability solutions, including metrics, logging, and distributed tracing
  • Participate in on-call and pager-duty rotations, ensuring high reliability and rapid incident response
  • Collaborate closely with product, data, and application engineering teams to deliver scalable and resilient solutions
  • Drive architectural decisions with a focus on scalability, performance, security, and operational excellence

Required Qualifications

  • B.S./M.S. in Computer Science or a related field, or equivalent experience
  • 8+ years of experience in cloud infrastructure, backend systems, or distributed systems engineering
  • Strong programming experience in Java and Python; C++ experience is a plus
  • Deep understanding of distributed systems principles, including consistency models, fault tolerance, and scalability
  • Hands-on production experience operating Kubernetes-based, containerized workloads
  • Experience with real-time and streaming data platforms (e.g., Spark, Flink)
  • Solid experience with data storage technologies, including relational (SQL) databases, NoSQL datastores, and search platforms such as Elasticsearch
  • Proven experience supporting mission-critical, always-on production systems
  • Strong background in designing and operating monitoring and observability platforms

Preferred Qualifications

  • Experience with SCADA systems or industrial communication protocols (Modbus, DNP3)
  • Background in energy systems, grid infrastructure, or industrial IoT environments
  • Experience with data visualization and analytics platforms such as Grafana or Power BI
  • Experience building or operating low-latency, high-throughput data pipelines
  • Prior experience supporting regulated or safety-critical systems
  • Hands-on experience with AWS cloud services and infrastructure

Core Technologies and Platforms

  • Java, Python (C++ a plus)
  • Kubernetes and cloud-native infrastructure
  • Real-time streaming and processing frameworks (Spark, Flink)
  • SQL databases, NoSQL datastores, and Elasticsearch
  • Observability platforms (metrics, logging, tracing)
  • Grafana, Power BI
  • SCADA and industrial control system integrations

Salary Range

$145,360.00 - $218,040.00

What We Offer

At Stem, you will work in a growing, innovative, mission-driven company with talented colleagues who have a passion for building renewable energy systems. Stem offers competitive compensation as well as a comprehensive set of benefits to support the health and wellness of our employees, including:

  • A competitive compensation package, including eligibility for a bonus or commission based on the role, and equity
  • Full health benefits on the first day of employment (several medical plan options, including HDHP and PPO; dental plans; FSA/HSA with employer contribution; employer-paid vision, LTD, STD, and life coverage; and a variety of voluntary coverage)
  • 401(k) (pre- or post-tax) on the first day of employment
  • 12 paid calendar holidays per year
  • Flexible time off
