Site Reliability Engineer (Java/GoLang/Python coding)

2 Minutes ago • 5 Years + • Devops

Job Summary

Job Description

SailPoint is a leader in identity security, providing solutions that secure and enable thousands of companies by ensuring workers have the right access. The Site Reliability Engineer will join the Reliability Engineering team for IdentityNow, SailPoint’s IDaaS product. This role involves building and running distributed systems at global scale, focusing on analyzing problems, innovating solutions, and collaborating to build reliable, scalable, and impactful systems. The team writes software to solve scalability, observability, security, reliability, and operability challenges.
Must have:
  • Make it easy to create, consume, manage, and scale reliable cloud production services.
  • Design, develop, and improve end-to-end reliability and maintainability for SailPoint SaaS services.
  • Coach engineering teams on observability best practices and Service Level Objectives (SLOs).
  • Lead engineering teams through post-incident reviews to define effective preventive actions.
  • Collaborate with developers to increase system reliability through short-term embedding programs.
  • Enable engineering teams to scale enterprise operations by providing guidance and support.
  • Manage cross-functional requirements with Engineering, Product, Services, and other departments.
  • Develop and implement automation tools and processes to streamline operations and enhance system performance.
  • Mentor on quality for design reviews, code, test cases, automation, observability, root cause analysis, and self-healing.
  • Influence architectural design, implementation, consolidation, and simplification for global scale.
  • Focus on expanding own skills and improving teammates' skills.
  • Drive operational excellence for frictionless operation, happy on call, and optimal customer experience.
  • 5+ years of experience in SRE or DevOps production operations supporting highly available SaaS/cloud environments.
  • Experience with cloud infrastructure environments, preferably AWS, and Infrastructure as code, preferably Terraform.
  • Experience with containerization technology and/or Kubernetes.
  • Experience with metrics, tracing, and logging observability tools like Prometheus, Grafana, Honeycomb, Kibana.
  • Experience with incident management, including conducting incident reviews.
  • Experience with programming languages (Java, Python, Go, etc.).
  • Strong understanding of Linux, software development, systems, networking, and Cloud concepts.
  • Strong interpersonal and teaming skills to set process and influence engineers.
  • Excellent communication skills, English fluency of C1 or higher preferred.
Good to have:
  • Bachelor's degree in Computer Science or other technical discipline, or equivalent experience

Job Details

SailPoint is the leader in identity security for the cloud enterprise. Our identity security solutions secure and enable thousands of companies worldwide, giving our customers unmatched visibility into the entirety of their digital workforce, ensuring workers have the right access to do their job – no more, no less.

IdentityNow is SailPoint’s Identity as a Service (IDaaS) product, and the Site Reliability Engineer will be a key player on our Reliability Engineering team servicing the IdentityNow product suite. We are looking for engineers with broad experience in building and running distributed systems at global scale. If you enjoy analyzing complicated problems, innovating creative solutions, and collaborating across teams to build reliable, scalable, and impactful solutions, come join our Reliability Engineering team. We are a team of people who write software to solve scalability, observability, security, reliability, and operability problems.

What You’ll Make Happen:

  • Make it easy for everyone to create, consume, manage, and scale reliable cloud production services to achieve more.
  • Work independently or collaboratively on SailPoint SaaS services to design, develop, and improve end-to-end reliability and maintainability for all services.
  • Coach engineering teams on observability best practices such as setting up well defined Service Level Objectives (SLOs).
  • Lead engineering teams through post-incident reviews to define effective preventive actions.
  • Collaborate effectively with developers to increase system reliability through short-term embedding programs.
  • Enable our engineering teams to scale our enterprise operations by providing guidance, best practices, and support as part of an SRE Center of Excellence.
  • Manage cross-functional requirements working with Engineering, Product, Services, and other departments.
  • Develop and implement automation tools and processes to streamline operations and enhance system performance.
  • Be a mentor of quality for design reviews, code, test cases, automation, observability, root cause analysis, and self-healing.
  • Influence architectural design, implementation, consolidation, and simplification for global scale.
  • Focus on expanding your own skills and improving your teammates' skills.
  • Drive operational excellence to deliver frictionless operation, happy on call, and optimal customer experience.

Requirements

  • 5+ years of experience in SRE or DevOps production operations supporting a highly available environment for a SaaS software or cloud service provider.
  • Experience with cloud infrastructure environments, preferably AWS, and Infrastructure as code, preferably Terraform.
  • Experience with containerization technology and/or Kubernetes.
  • Experience with metrics, tracing, and logging observability tools such as Prometheus, Grafana, Honeycomb, and Kibana.
  • Experience with incident management, including conducting incident reviews.
  • Experience with programming languages (Java, Python, Go, etc.).
  • Strong understanding of Linux, software development, systems, networking, and Cloud concepts.
  • Strong interpersonal and teaming skills: the ability to set and enforce process and to influence engineers who are not direct reports.
  • Excellent communication skills; English fluency of C1 or higher preferred.
  • Bachelor's degree in Computer Science or other technical discipline, or equivalent experience, is preferred but not required.

Within the first 30 days you will:

  • Onboard into your new role and get familiar with our product offering and technology stack.
  • If applicable, come up to speed on the Identity and Access Management space.
  • Get to know your peers, leaders, and other engineers to understand the current state, challenges, and motivations.
  • Understand the current state of our reliability practices.

By 90 days:

  • Contribute to the technical architecture of our reliability and capacity planning practices, providing architectural ideas.
  • Look beyond the immediate backlog to create and share a forward-thinking technical vision for your team.
  • Prioritize projects, define scopes of work, and develop solutions.

By 6 months:

  • You are regularly mentoring and coaching members of your team.
  • You own multiple significant projects.
  • You provide technical leadership to your team while also delivering high quality code on your own.
  • You lead significant and thoughtful critiques of others’ design documents.
  • You consistently achieve targets and meet deadlines, and you ensure the quality of deliverables exceeds expectations.
  • You are flexing your lifelong learning muscles, staying abreast of emerging external technologies to understand when to introduce them to your team.

About The Company

SailPoint is a leading provider of identity security for the modern enterprise. Enterprise security starts and ends with identities and their access, yet the ability to manage and secure identities today has moved well beyond human capacity. Using a foundation of artificial intelligence and machine learning, the SailPoint Identity Security Platform delivers the right level of access to the right identities and resources at the right time—matching the scale, velocity, and environmental needs of today’s cloud-oriented enterprise.
