Your Mission
Working for the teams who make and deliver our software, you’ll make sure services are effective across test, development and production environments.
It’ll be your job to drive improvements, whether they relate to boosting enjoyment of the software or the efficiency of our teams. You’ll be empowered to improve software, using your analytical and diagnostic flair to the full. And, above all, you’ll make certain our agile methodology isn’t hampered by operational requirements, or delays in delivery or software transitions.
You’ll be part of a super-motivated DevOps team whose mission is to create new and awesome tools that empower our development teams to work in a safer, more automated way, while standardizing processes so that all development teams stay aligned with each other.
What you'll do
- Plan upgrades of large, complex systems across a variety of environments, working closely with key staff across multiple sites.
- Manage technical releases including software deploys, de-risking deploys and change request reviews.
- Investigate and diagnose problems, assign or carry out code and deployment fixes, and keep documents up to date.
- Make workflow more efficient and the system more stable.
- Build and verify new servers, including tracking network changes.
- Build tools to automate, simplify and de-risk processes.
- Work with bleeding-edge tools and technologies.
- Administer more than 50 Kubernetes clusters with hundreds of services.
What you'll bring
- Familiarity with containerization (Docker, rkt) and virtualization (VMware, Vagrant, VirtualBox) technologies.
- Strong experience working with Kubernetes clusters and observability tools such as Kibana, Logstash, Prometheus and Grafana. Familiarity with service meshes, ideally Istio.
- Knowledge of public and private clouds: Google Cloud Platform, AWS, Azure, OpenStack, VMware.
- Experience with the HashiCorp stack: Terraform, Consul and Vault. Ability to understand and implement Infrastructure-as-Code principles.
- A deep understanding of Continuous Integration and Deployment concepts and tools, including SCMs (Git, Subversion), CI servers (Bamboo, Jenkins), build tools (Maven, Gradle), binary repositories (Nexus, Artifactory) and code quality tools (SonarQube). Experience with the Atlassian stack is desirable.
- Strong knowledge of platform automation technologies, such as Puppet and Ansible. Experience with AWX is a plus.
- A good understanding of database administration, including but not limited to Cassandra, Neo4j and Elasticsearch, as well as traditional databases (MySQL, Oracle, etc.).
- Basic programming know-how, including an understanding of code and coding concepts, plus advanced knowledge of scripting languages (Bash, Python, Perl, Ruby...). Knowledge of the Django framework is a plus.
- Familiarity with software development, including concepts, current technologies and frameworks.
- A talented Linux administrator who knows about network diagnostics and services.
- Basic knowledge of Windows systems administration and investigation, covering the Event Log and Services.
- Flexible enough to meet tight deadlines and driven to deliver to strict SLAs.