We are seeking a highly motivated and skilled Full Stack Cloud Operation Specialist to join our dynamic team. This role will be responsible for ensuring the reliability, performance, and security of our cloud-based infrastructure and applications. The ideal candidate will possess a strong understanding of both cloud operations and software development principles, enabling them to effectively troubleshoot issues across the full technology stack. They will play a crucial role in maintaining our production environment, implementing automation, and providing technical support to internal teams.
Responsibilities:
- Monitor and maintain the health, performance, and availability of our cloud infrastructure and applications (primarily on AWS).
- Troubleshoot and resolve incidents, perform root cause analysis, and implement preventative measures.
- Maintain and update infrastructure-as-code (IaC) written in Terraform and CloudFormation.
- Collaborate with development teams to deploy and support new applications and features in the cloud.
- Contribute to the design and implementation of scalable, resilient, and secure cloud architectures.
- Manage and optimize cloud resources to ensure cost-effectiveness.
- Document operational procedures, troubleshooting steps, and system configurations.
- Stay up to date with the latest cloud technologies and best practices.
- Work with development, security, and other teams to ensure smooth operations.
Experience:
- 6+ years of experience in cloud operations, DevOps, or a similar role.
- Hands-on experience with cloud services (e.g., compute, storage, networking, databases).
- Experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation).
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack, CloudWatch, Datadog).
- Familiarity with CI/CD pipelines (e.g., Jenkins, GitLab CI/CD, GitHub Actions).
- Experience with scripting languages (e.g., Python, Bash).
- Experience with containerization technologies is a plus.
- Proficiency in Linux distributions such as Ubuntu, CentOS, Red Hat Enterprise Linux (RHEL), or Debian.
- Strong command of the Linux command line and shell scripting.
- Understanding of networking principles and protocols.
Skills:
- Strong troubleshooting and problem-solving skills.
- Excellent understanding of cloud computing concepts and architectures.
- Solid understanding of networking principles.
- Ability to work independently and as part of a team.
- Strong communication and documentation skills.