SRE - Systems, Networking, Cloud & Development

Thales

3+ Years | Bangalore, Karnataka, India (Hybrid) | Full Time | 24 months ago


Job Summary

Thales is seeking a Site Reliability Engineer (SRE) with expertise in Systems, Networking, Cloud, and Development. The role involves applying SRE principles such as measurement, toil elimination, and reliability modeling. You will educate development teams on best practices, architect infrastructure solutions, and troubleshoot operational issues. Responsibilities include proactive data analysis, testing network/system integrity, resolving business-impacting issues, and participating in escalations, incident response, RCA, blameless postmortems, and on-call rotations. The ideal candidate will have at least 3 years of experience in cloud/web/CDN scale infrastructure, proficiency in Python and Go, expert knowledge of Linux and networking protocols, and experience with DevOps principles, CI/CD, monitoring tools like Prometheus and Grafana, big data technologies, container management (Docker, Kubernetes), and data telemetry and pipelines. Experience with C/C++, BGP, and Anycast routing is a plus.

Must Have

3+ years in cloud/web/CDN infrastructure
Experience with Python and Go
Expert Linux systems knowledge
Expert network programming and protocols
Experience with DevOps principles
Experience with CI/CD tools
Experience with monitoring tools
Experience with big data technologies
Experience with containers and orchestration
Experience with data telemetry and pipelines
Experience in software development and monitoring distributed systems
Experience with Agile methodologies
Team player, accountable for business urgency

Good to Have

C/C++ experience
BGP and Anycast routing experience
Infrastructure as Code experience
UI visualization experience

Perks & Benefits

Career development opportunities
Global mobility policy
Flexibility in working

Job Description

Location: Bangalore - Indraprastha, India

Thales people architect identity management and data protection solutions at the heart of digital security. Businesses and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter and much more. More than 30,000 organizations already rely on us to verify the identities of people and things, grant access to digital services, analyze vast quantities of information and encrypt data to make the connected world more secure.

Responsibilities

Apply SRE core tenets of measurement (SLI/SLO/SLA), toil elimination, and reliability modeling
Enable and educate development teams on industry best practice design patterns, ways of working and operational knowledge to ensure platform continuity
Develop and architect solutions to infrastructure and operational aspects of new products and feature sets
Assist with go/no-go preplanning, verification/validation, and review of existing and new products/services
Proactively analyze data and test the integrity of network/systems to ensure production applications and services are operating optimally
Work within development teams to troubleshoot and resolve business-affecting issues
Handle escalations, incident response, root cause analysis (RCA), and blameless postmortems
Participate in on-call rotation

Qualifications

At least 3 years of professional experience within a cloud/web/CDN scale infrastructure
Experience with Python and Go. C/C++ a plus
Expert knowledge of Linux systems, network programming, and protocols (TCP, UDP, DNS, TLS/SSL, HTTP)
Experience with BGP and Anycast routing is a plus
Experience with DevOps principles and concepts such as Infrastructure as Code (Ansible/SaltStack), CI/CD (GitLab, Jenkins, Git), monitoring and visualization (Prometheus, Grafana)
Experience with big data technologies such as NoSQL/RDBMS, Redis, Elasticsearch, Kafka
Experience with containers and container management (Docker, Kubernetes)
Experience analyzing and building data telemetry, modeling, pipelines, UI visualization
Experience in developing software, troubleshooting, and monitoring large-scale distributed systems
Implement software engineering best practices/standards and software development life cycle
Working knowledge and experience of Agile software development methodologies
A strong team player who is accountable towards business urgency
Ability to stay organized in a multi-tasking environment
Self-starter personality

At Thales we provide CAREERS and not only jobs. With Thales employing 80,000 employees in 68 countries, our mobility policy enables thousands of employees each year to develop their careers at home and abroad, in their existing areas of expertise or by branching out into new fields. Together we believe that embracing flexibility is a smarter way of working. Great journeys start here, apply now!
