Cloud Operations Engineer
Extreme Networks
Job Summary
Extreme Networks is seeking a Cloud Operations Engineer in Canada to manage and maintain its cloud service infrastructure across AWS, GCP, and Azure. The role involves participating in continuous cloud service operations with global teams, troubleshooting production issues, and performing root cause analysis. Responsibilities include collaborating with development and QA teams, managing release deployments, system maintenance, and cloud expansion. A key aspect is designing and implementing deployment automation for Kubernetes-based microservices and improving service availability, scalability, monitoring, and security. The ideal candidate will have a strong background in CloudOps/DevOps, public cloud experience, knowledge of Linux, security, networking fundamentals, and container-based architectures (Docker, Kubernetes), along with experience in deployment automation tools, diagnosing complex application problems, and working with monitoring and messaging tools.
Must Have
- Manage and maintain ExtremeCloud service infrastructure in AWS, GCP & Azure.
- Troubleshoot and follow up on production infrastructure/application issues.
- Design and implement a deployment automation platform for Kubernetes.
- Improve service availability and scalability through tuning and automation.
- 5+ years of experience in CloudOps/DevOps.
- Hands-on experience with AWS or any public cloud.
- Knowledge of Linux, security and networking fundamentals.
- Working knowledge of Docker, Kubernetes.
- Strong follow-through and initiative.
Good to Have
- Participate in continuous cloud service operations with US, EU, and China teams.
- Communicate with Dev/QA as well as external carriers.
- Analyze service performance and identify bottlenecks.
- Improve service monitoring coverage, accuracy and efficiency.
- Participate in cloud security and compliance implementation.
- Working knowledge of Argo Workflows, Terraform, Helm.
- Experience in diagnosing and resolving complex application problems.
- Working knowledge of Elasticsearch, PostgreSQL, Redis, Ignite, Kafka, RabbitMQ.
- Experience with monitoring tools (Nagios, Kibana, Prometheus).
- Experience with cloud security and compliance implementation.
- Comfortable working within a distributed team in multiple time zones.
Job Description
Responsibilities:
- Manage and maintain ExtremeCloud service infrastructure in AWS, GCP & Azure.
- Participate in continuous cloud service operations with US, EU, and China teams.
- Troubleshoot and follow up on production infrastructure and application issues.
- Drive root cause analysis and resolution.
- Communicate with Dev/QA as well as external carriers to resolve and prevent issues.
- Participate in release deployment, system maintenance and cloud expansion.
- Design and implement a deployment automation platform for Kubernetes-based microservices.
- Improve service availability and scalability through tuning, automation, tools, and process.
- Analyze service performance, identify bottlenecks, and provide actionable improvement plans.
- Improve service monitoring coverage, accuracy and efficiency.
- Participate in cloud security and compliance implementation.
Ideal Qualifications:
- BS level technical degree required; Computer Science or Engineering background preferred.
- 5+ years of experience in a CloudOps / DevOps role.
- Hands-on experience with AWS or any public cloud (Azure, GCP, etc.).
- Knowledge of Linux, security and networking fundamentals.
- Working knowledge of container-based architecture and deployment (Docker, Kubernetes).
- Working knowledge of deployment automation development (Argo Workflows, Terraform, Helm).
- Experience in diagnosing and resolving complex application problems.
- Working knowledge of Elasticsearch, PostgreSQL, Redis, Ignite, Kafka and RabbitMQ.
- Experience with monitoring tools (Nagios, Kibana, Prometheus).
- Experience with cloud security and compliance implementation is a plus.
- Strong follow-through and initiative to stay with issues until they are resolved.
- Comfortable working within a distributed team located in multiple time zones.