Senior Site Reliability Engineer (AWS, AI/ML, & APM)

2 Months ago • 5 Years + • Devops • $70,000 PA - $80,000 PA

Job Summary

Job Description

As a Senior Site Reliability Engineer (SRE) at Granicus, you will ensure the reliability, scalability, and performance of its services. Responsibilities include providing on-call production support, working on tickets, monitoring and maintaining systems, automating processes, incident management, system improvements, collaboration with software engineers, documentation, capacity planning, and security. This role requires managing large-scale, high-availability systems and supporting AI/ML infrastructure. Granicus is transforming the Govtech industry by bringing governments and their constituents together and has been consistently recognized as one of the best companies to work for.
Must have:
  • 5+ years in SRE or similar with large-scale system management
  • Expertise in Linux/Unix, cloud platforms (AWS, Azure, or Google Cloud)
  • Strong scripting skills (Python, Bash, Ruby) and programming (Go, Java, C++)
Good to have:
  • Experience with ELK Stack for logging, monitoring, and observability
  • Experience with configuration management tools (Ansible, Chef, Puppet)
  • Exposure to AI/ML toolchains like AWS Bedrock and SageMaker
Perks:
  • Flexible Time Off
  • Medical, Dental & Vision Insurance
  • 401(k) plan with matching contribution
  • Paid Parental Leave
  • Employer-paid Short and Long Term Disability Insurance
  • Group Term Life Insurance and AD&D Insurance
  • Group legal coverage

Job Details

The Company 

Serving the People Who Serve the People 


Granicus is driven by the excitement of building, implementing, and maintaining technology that is transforming the Govtech industry by bringing governments and their constituents together. We are on a mission to support our customers in meeting the needs of their communities and implementing our technology in ways that are equitable and inclusive. Granicus has consistently appeared on the GovTech 100 list over the past 5 years and has been recognized as one of the best companies to work for on BuiltIn.


Over the last 25 years, we have served 5,500 federal, state, and local government agencies, and more than 300 million citizen subscribers power an unmatched Subscriber Network that uses our digital solutions to make the world a better place. With comprehensive cloud-based solutions for communications, government website design, meeting and agenda management software, records management, and digital services, Granicus empowers stronger relationships between government and residents across the U.S., U.K., Australia, New Zealand, and Canada. By simplifying interactions with residents while disseminating critical information, Granicus brings governments closer to the people they serve, driving meaningful change for communities around the globe.

Want to know more? See more of what we do here.  


Granicus is seeking an experienced and highly skilled Senior Site Reliability Engineer (SRE) to join our SRE team. As a Senior SRE, you will play a pivotal role in ensuring the reliability, scalability, and performance of our services. You will lead efforts in building and maintaining a robust infrastructure, automating processes, and guiding the team to implement best practices in site reliability.


What your impact will look like:
  • On-call Production Support: Provide production support in shifts according to the team's on-call roster.
  • Tickets: Work on tickets raised by customers and by internal engineering/implementation teams while not on call for production support. For example, a client may request a data correction on the database server that cannot be made through the web interface.
  • Backlog: Work on SRE backlog items.
  • Monitor and Maintain Systems: Continuously monitor the health and performance of our services, systems, and infrastructure. Respond to alerts and incidents promptly to ensure high availability.
  • Automate Processes: Develop and maintain automation scripts and tools to streamline operations and reduce manual intervention.
  • Incident Management: Assist in troubleshooting and resolving incidents, performing root cause analysis, and implementing long-term fixes to prevent recurrence.
  • System Improvements: Participate in designing and implementing system improvements to enhance reliability, scalability, and performance.
  • Collaboration: Work closely with software engineers to understand application requirements, provide feedback on design and architecture, and support deployment and release processes.
  • Documentation: Create and maintain documentation for processes, procedures, and troubleshooting guides to ensure knowledge sharing within the team.
  • Capacity Planning: Assist in capacity planning activities to anticipate future needs and ensure that our infrastructure can handle growth.
  • Security: Implement and adhere to security best practices to protect our systems and data.


Experience:
  • 5+ years in site reliability engineering, system administration, or a similar role, with a proven track record of managing large-scale, high-availability systems. Experience supporting AI/ML infrastructure, including model deployment, inference optimization, and integration with services like AWS Bedrock, is highly desirable.


Technical Skills:
  • Expertise in Linux/Unix systems and cloud platforms (AWS, Azure, or Google Cloud).
  • Strong proficiency in scripting languages (Python, Bash, Ruby) and programming languages (Go, Java, C++).
  • Familiarity with AI/ML operations, including model lifecycle management, vector databases, and inference performance tuning.


Tools and Technologies:
  • Experience with the ELK Stack (Elasticsearch, Logstash, Kibana) for centralized logging, monitoring, and observability.
  • Experience with configuration management tools (Ansible, Chef, Puppet).
  • Exposure to AI/ML toolchains, including AWS Bedrock, SageMaker, and LLMOps frameworks.
  • Certifications: Relevant certifications such as AWS Certified DevOps Engineer, AWS Certified Machine Learning – Specialty, Google Cloud Professional DevOps Engineer, or similar are a plus.


$70,000 - $80,000 a year
+ bonus and benefits

 

Don’t have all the skills/experience mentioned above? At Granicus, we are trying to build diverse, inclusive teams. We do not have degree requirements for most of our roles. If you don’t meet every requirement above but are excited to learn more, we encourage you to apply. We might just be able to find another role that could be a perfect fit! 


Security and Privacy Requirements

- Responsible for Granicus information security by appropriately preserving the Confidentiality, Integrity, and Availability (CIA) of Granicus information assets in accordance with the company's information security program.

- Responsible for ensuring the privacy of our employees, our customers, and their data, and for completing all required privacy training in a timely manner, in accordance with company policies.

The Team

- We are a remote-first company with a globally distributed workforce across the United States, Canada, United Kingdom, India, Armenia, Australia, and New Zealand.


The Culture

- At Granicus, we are building a transparent, inclusive, and safe space for everyone who wants to be a part of our journey.

- A few culture highlights include:

- Employee Resource Groups to encourage diverse voices.

- Coffee with Mark sessions: our employees get to interact with our CEO on very important and sometimes difficult issues ranging from mental health to work-life balance and current affairs.

- Microsoft Teams communities focused on wellness, art, furbabies, family, parenting, and more.

- We bring in special guests from time to time to discuss issues that impact our employee population.


The Impact

- We are proud to serve dynamic organizations around the globe that use our digital solutions to make the world a better place — quite literally. We have so many powerful success stories that illustrate how our solutions are impacting the world. See more of our impact here.


The Benefits 


At Granicus, we offer a competitive benefits package that allows employees to tailor benefits to their needs. Benefits listed below are for employees based in the U.S.


- Flexible Time Off

- Medical (includes an option that is paid 100% by Granicus!), Dental & Vision Insurance

- 401(k) plan with matching contribution

- Paid Parental Leave

- Employer-paid Short and Long Term Disability Insurance, Group Term Life Insurance and AD&D Insurance

- Group legal coverage 

- And more!


 

Granicus is committed to providing equal employment opportunities. All qualified applicants and employees will be considered for employment and advancement without regard to race, color, religion, creed, national origin, ancestry, sex, gender, gender identity, gender expression, physical or mental disability, age, genetic information, sexual or affectional orientation, marital status, status with regard to public assistance, familial status, military or veteran status or any other status protected by applicable law. 
