Site Reliability Engineer (SRE 2) - Azure Focused
Granicus is seeking a motivated and detail-oriented mid-level Site Reliability Engineer (SRE 2) to join our dynamic IT team. As an SRE, you will work closely with our software engineers to ensure the reliability, availability, and performance of our services. This is an excellent opportunity for someone looking to further develop their career in site reliability engineering and gain hands-on experience with cutting-edge technologies.
Essential Functions
- On-call Production Support: Provide production support in shifts according to the team's on-call roster.
- Ticket Work: When not on call for production support, work on tickets raised by customers and by internal engineering/implementation teams. For example, a client may request a data correction on the database server that cannot be made through the web interface.
- Backlog: Work on SRE backlog items.
- Monitor and Maintain Systems: Continuously monitor the health and performance of our services, systems, and infrastructure. Respond to alerts and incidents promptly to ensure high availability.
- Automate Processes: Develop and maintain automation scripts and tools to streamline operations and reduce manual intervention.
- Incident Management: Assist in troubleshooting and resolving incidents, performing root cause analysis, and implementing long-term fixes to prevent recurrence.
- System Improvements: Participate in the design and implementation of system improvements to enhance reliability, scalability, and performance.
- Collaboration: Work closely with software engineers to understand application requirements, provide feedback on design and architecture, and support deployment and release processes.
- Documentation: Create and maintain documentation for processes, procedures, and troubleshooting guides to ensure knowledge sharing within the team.
- Capacity Planning: Assist in capacity planning activities to anticipate future needs and ensure that our infrastructure can handle growth.
- Security: Implement and adhere to security best practices to protect our systems and data.
Knowledge/Skills/Abilities
- Technical Skills: Good understanding of Linux/Unix systems, networking, and cloud services (AWS, Azure, or Google Cloud). Experience with scripting languages such as Python, Bash, or Ruby.
- Tools and Technologies: Experience with monitoring and logging tools (e.g., Prometheus, Grafana, Splunk), version control systems (e.g., Git), and CI/CD pipelines.
- Problem-Solving: Strong analytical and problem-solving skills with a proactive approach to identifying and addressing issues.
- Communication: Excellent verbal and written communication skills, with the ability to work effectively in a team environment.
- Learning Mindset: Eagerness to learn new technologies and improve existing skills. Openness to receiving feedback and applying it to improve performance.
Preferred Qualifications
- Experience: Internships or academic projects involving system administration, cloud services, or software development.
- Experience/Credentials:
- Minimum of four years of experience in SRE, DevOps, and production support
- Certifications: Relevant certifications such as AWS Certified Solutions Architect, Google Cloud Professional DevOps Engineer, or similar
Other Job Info
- These statements are intended to describe the general nature and level of work being performed by employees assigned to this job. This is not intended to be an exhaustive list of all responsibilities, duties, and skills required of employees assigned to this job.
- This role is typically performed on a computer using Zoom or Teams. Individuals will be on camera throughout the day engaging with other employees. The role is typically performed indoors within a home office environment. This role is typically performed while sitting or standing at a desk. The individual will occasionally lift light objects.
Academic Qualifications and Certifications
- Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent practical experience
Shift Time
- The position requires flexibility in working hours to cover for any overlap and attend team meetings as needed.
- Shift Time: Rotational shifts, including weekends (typically two weeks every month)
Security and Privacy Requirements
- Responsible for Granicus information security by appropriately preserving the Confidentiality, Integrity, and Availability (CIA) of Granicus information assets in accordance with the company's information security program.
- Responsible for protecting the privacy of our employees, our customers, and their data, and for completing all required privacy training in a timely manner, in accordance with company policies.
The Team
- We are a remote-first company with a globally distributed workforce across the United States, Canada, United Kingdom, India, Armenia, Australia, and New Zealand.
The Culture
- At Granicus, we are building a transparent, inclusive, and safe space for everyone who wants to be a part of our journey.
- A few culture highlights include:
- Employee Resource Groups to encourage diverse voices.
- Coffee with Mark sessions: Our employees get to interact with our CEO on very important and sometimes difficult issues ranging from mental health to work-life balance and current affairs.
- Microsoft Teams communities focused on wellness, art, furbabies, family, parenting, and more.
- We bring in special guests from time to time to discuss issues that impact our employee population.
The Impact
- We are proud to serve dynamic organizations around the globe that use our digital solutions to make the world a better place — quite literally. We have so many powerful success stories that illustrate how our solutions are impacting the world. See more of our impact here.