DevOps / Triage Engineer
SailPoint
Job Summary
As a DevOps / Triage Engineer on the Infrastructure and Platform Services team, you will be responsible for DevOps and Triage work in the UAE region for SailPoint's Identity Security Cloud platform. This role involves being the first line of defense for complex technical issues, applying changes to the production environment, diagnosing and resolving problems in a SaaS-based microservices architecture, and collaborating with engineering, product, and DevOps teams to improve platform stability and performance while meeting data sovereignty requirements.
Must Have
- Physically located in the UAE.
- Responsible for DevOps and Triage work in the UAE region.
- First line of defense for complex technical issues.
- Apply changes to the production environment in the UAE.
- Diagnose and resolve problems in a SaaS microservices architecture.
- Collaborate with engineering, product, and DevOps teams.
- Experience in 24x7 production operations.
- Experience with AWS, Kubernetes, and ArgoCD.
- Strong understanding of microservices architecture and debugging distributed systems.
- Proficient in scripting languages (Python, Bash, PowerShell) and infrastructure-as-code tools (Terraform).
- Strong understanding of system and networking concepts and troubleshooting techniques.
Job Description
- This is a remote position; however, you will also have access to an office for team gatherings and planning sessions as needed. Due to the requirements of UAE customers, this position requires being physically located in the UAE. In-person interviews will be required.
About the role:
As a DevOps / Triage Engineer, you will be a key player on the Infrastructure and Platform Services team servicing the Identity Security Cloud platform. You will be responsible for DevOps and Triage work in our UAE region. You will proactively work with Engineering, Product, Services, and other functional departments to implement and operate our global customer-facing SaaS infrastructure.
The ideal candidate will be a self-starter who enjoys a fast-paced job, thrives on problem solving, and is committed to delivering seamless product availability to large enterprises around the world.
You will be a member of the UAE DevOps team that performs triage and DevOps work for our product running in the UAE region. You will be the first line of defense for complex technical issues escalated from Support Level 1 & 2 teams. Your role will combine applying changes to our production environment in the UAE with diagnosing, reproducing, and driving resolutions for challenging problems in a SaaS-based microservices architecture. You will collaborate with engineering, product, and DevOps teams to improve the stability and performance of our cloud platform. You will partner with leaders at all levels across the organization to make our product work while meeting data sovereignty requirements.
About the team:
The UAE DevOps team will work together to implement, maintain, and support our products in the UAE region in AWS. The team will provide hands-on implementation and support in AWS in the UAE region for our SaaS products.
Roadmap for success:
Within the first 30 days:
- Onboard into your new role
- Learn about our product offering and technology
- Proactively meet peers and stakeholders
- Set up your test and development environment
- Learn the tools used for DevOps work
- Begin learning how to investigate and triage customer-reported issues
By 90 days:
- Collaborate on projects with team members and other teams
- Be fully integrated with the team and join the on-call rotation to support our product
- Work closely with Support, Engineering, and DevOps teams to escalate and resolve critical incidents efficiently.
- Collaborate with Support to gather additional information and guide them through issue resolution.
- Contribute to root cause identification and help implement long-term fixes.
- Suggest process improvements to enhance troubleshooting efficiency.
- Maintain and enhance knowledge base documentation to improve internal and customer-facing support resources.
- Investigate and triage complex customer-reported issues by analyzing logs, debugging, and reproducing errors.
- Develop a technical understanding of Identity Security Cloud and its microservices-based architecture.
By 6 months:
- Investigate and triage complex customer-reported issues by analyzing logs, debugging, and reproducing errors.
- Independently handle any on-call incidents
- Collaborate with other teams to bring new features and services into production in the UAE region
- Proactively meet standards for information security and compliance (such as ISO, SOC, SSAE 16, etc.)
Requirements:
Background & Experience:
- Experience in 24x7 production operations, preferably supporting a highly available environment for a SaaS or cloud service provider
- Experience with cloud infrastructure environments, preferably AWS
- Experience with Kubernetes and tools like ArgoCD
- Strong understanding of microservices architecture and debugging distributed systems.
- Passion for and expertise in working with engineering, product, and support teams
- Knowledge of current development strategies and trends across web and mobile platforms
- Release automation (Jenkins, etc.), system administration, system configuration, and system debugging experience
- Experience using scripting languages (Python, Bash, PowerShell, etc.) and infrastructure-as-code tools (Terraform)
- Strong understanding of system and networking concepts and troubleshooting techniques
- Strong interpersonal and teamwork skills, including the ability to set and enforce processes and to influence engineers who are not direct reports.
- Ability to operate in an agile, entrepreneurial start-up environment.
Education (preferred, not required):
- Bachelor's degree in Computer Science or other technical discipline, or equivalent experience