We are seeking a skilled and experienced DevOps Engineer Tech Lead to join our team, responsible for the management and optimization of our cloud environments across AWS and GCP.
This role will primarily focus on building and maintaining cloud infrastructure for Cloud WAF development teams.
The ideal candidate will possess deep knowledge of various cloud platforms and will be responsible for ensuring smooth operation and scalability of development and production environments.
Responsibilities:
Work with development teams to ensure that applications have scalability and reliability built in from day one. Agile is second nature to you, and you’re excited to work in Scrum teams and represent the DevOps perspective
Participate in software design discussions to improve scalability, service reliability, cost, and performance. You’ve helped create services that are critical to customers’ success
Deploy automation for provisioning and operating infrastructure at large scale. You are experienced in Infrastructure as Code concepts and have put them into production
Partner with teams to improve CI/CD processes and technology. Helping teams deliver value early is what you strive for
Mentor members of the team on large-scale cloud deployments. You’re an expert in deploying in the cloud and can bring a teaching mindset to help others benefit from your experience
Drive the adoption of observability practices and a data-driven mindset. You love metrics, graphs, and gaining a deep understanding of why things happen in a system, and you help others gain visibility into the things they build
Set up processes like on-call rotations and runbooks to continue supporting the microservices owned by the development teams while finding ways to reduce the time to resolution and improve the reliability of services
Required Qualifications:
Bachelor’s/Master’s degree in Computer Science
14+ years of industry experience in engineering
7+ years of working with microservices architectures on Kubernetes
Hands-on experience with container-native tools like Helm, Istio, and Vault running in Kubernetes
Experience with AWS public cloud at medium to large scale
Proficient in CI/CD platforms like GitLab CI, Jenkins, etc.
Experience enhancing observability by implementing distributed tracing, logging standards, dashboard standardization, profiling, and other relevant practices to meet Service Level Objectives (SLOs)
Hands-on experience with monitoring tools such as Prometheus and Grafana
Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
Experience with Kafka, Postgres, and MySQL performance tuning is a plus
Desired Qualifications:
Fluent scripting skills, preferably in Python or Bash
Experience with GCP public cloud at medium to large scale
Knowledge of operating systems (processes, threads, concurrency, etc.)
Experience working with Unix/Linux systems from kernel to shell and beyond