Senior Site Reliability Engineer

2 Hours ago • 3 Years +

Job Summary

Job Description

The Senior Site Reliability Engineer will be responsible for the full system lifecycle, including infrastructure provisioning, system configuration, monitoring, and incident response in production environments. This role involves ensuring high application performance and availability by collaborating with development teams, operations teams, network engineers, and vendors. The SRE will guide incident responses, identify root causes, and provide solutions to mitigate and resolve issues. This position requires working with AWS and containerized infrastructure (Kubernetes, EKS, ECS), implementing configuration management and automation solutions, and performing 24/7 on-call duties.
Must have:
  • 3+ years in high-traffic SaaS environments.
  • Deep expertise in high availability processes and tools.
  • Skills to build automated cloud orchestration on AWS.
  • Experience with containerized infrastructure.
  • Experience with configuration management and automation.
  • Strong working knowledge of Linux.
  • Solid scripting skills (Bash, Python).
  • Experience with performance diagnostics and tuning.
  • BS in Computer Science or equivalent.
  • Good verbal and written communication skills.
Good to have:
  • Building PCI-compliant systems.
  • Working with infrastructure for payment processing systems.
  • Developing high-volume transaction systems.
Perks:
  • Make an Impact: Be part of a mission-driven organization.
  • Innovative Environment: Work in a fast-paced atmosphere.
  • Collaborative Team: Join a fun and collaborative team.
  • Competitive Benefits: Enjoy competitive pay and benefits.
  • Holistic Support: Enjoy financial assistance and family planning support.
  • Growth Opportunities: Participate in learning and development.
  • Commitment to DEI: Contribute to diversity, equity, and inclusion.
  • Community Engagement: Make a difference through volunteering.

Job Details

Want to help us help others? We’re hiring! 

GoFundMe is the world’s most powerful community for good, dedicated to helping people help each other. By uniting individuals and nonprofits in one place, GoFundMe makes it easy and safe for people to ask for help and support causes—for themselves and each other. Together, our community has raised more than $40 billion since 2010.

Come join us! The GoFundMe team is searching for our next Site Reliability Engineer (SRE). You will be responsible for the full system lifecycle including infrastructure provisioning, system configuration, monitoring, and incident response in production environments. The SRE uses technical analysis to assess the availability, latency, scalability, and efficiency of a product or infrastructure and builds reliability into systems. To ensure the highest level of application performance and availability, the reliability engineer works closely with development teams, relevant functional operations teams, network engineers, database administrators, technology vendors and partners. The successful reliability engineer effectively guides incident responses, helps identify root causes and provides recommendations or solutions to mitigate and resolve issues.

Candidates considered for this role will be located in San Diego, CA, with an in-office requirement of 2-3 times a week.

The Job

  • Design and build out our cloud infrastructure (we run everything in AWS).
  • Participate in software and system performance analysis, tuning, and service capacity planning.
  • Manage the availability, scalability, security, and performance of our platform and applications.
  • Diagnose bottlenecks across the full stack and recommend interim workarounds while long-term solutions are investigated.
  • Periodically assess all monitoring requirements and implement enhancements to meet or exceed changing business needs.
  • Proactively review, recommend, and implement changes to the live infrastructure after ensuring the right validation has been carried out.
  • Work across engineering to improve our SLO/SLI framework.
  • Use data analysis to pick up trends before they become major problems.
  • Perform 24/7 on-call duties.

You

  • 3+ years of experience in operating high-traffic SaaS environments.
  • Deep expertise in the mentality, processes, and tools needed to deliver high availability.
  • Skills to build a fully automated, highly elastic cloud orchestration framework on AWS.
  • Experience running containerized infrastructure in production (Kubernetes on EKS, AWS ECS).
  • Experience implementing configuration management and automation solutions using Infrastructure as Code, CI/CD, and GitOps (Ansible, Terraform, ArgoCD, GitHub Actions).
  • Strong working knowledge of Linux and its underlying components, system statistics, performance tuning, filesystems, and I/O.
  • Solid scripting skills (e.g. Bash, Python).
  • Experience with performance diagnostics, performance tuning, capacity planning, and monitoring.
  • BS in Computer Science or equivalent.
  • Good verbal and written communication skills.

Preferred 

  • Building PCI-compliant systems.
  • Working with infrastructure for payment processing systems.
  • Developing high-volume transaction systems.
  • Passion for building fault-tolerant and secure platforms.

Technologies you are likely to be working with

AWS, Docker, Kubernetes, ECS, Helm, ArgoCD, Cloudflare, Terraform, Ansible, MySQL/Aurora, Nginx, Loft, DevSpace, Elasticsearch, Kafka, Redis, GitHub, Bash, Python, PHP, Java, Kotlin, Sumo Logic, New Relic, PagerDuty

Why you’ll love it here

  • Make an Impact: Be part of a mission-driven organization making a positive difference in millions of lives every year.
  • Innovative Environment: Work with a diverse, passionate, and talented team in a fast-paced, forward-thinking atmosphere.
  • Collaborative Team: Join a fun and collaborative team that works hard and celebrates success together.
  • Competitive Benefits: Enjoy competitive pay and comprehensive healthcare benefits.
  • Holistic Support: Enjoy financial assistance for things like hybrid work and family planning, along with generous parental leave, flexible time-off policies, and mental health and wellness resources to support your overall well-being.
  • Growth Opportunities: Participate in learning, development, and recognition programs to help you thrive and grow.
  • Commitment to DEI: Contribute to diversity, equity, and inclusion through ongoing initiatives and employee resource groups.
  • Community Engagement: Make a difference through our volunteering and Gives Back programs.

We live by our core values: impatient to be great, find a way, earn trust every day, fueled by purpose. Be a part of something bigger with us!

GoFundMe is proud to be an equal opportunity employer that actively pursues candidates of diverse backgrounds and experiences. We do not discriminate on the basis of race, color, religion, ethnicity, nationality or national origin, sex, sexual orientation, gender, gender identity or expression, pregnancy status, marital status, age, medical condition, mental or physical disability, or military or veteran status.

The total annual salary for this full-time position is $128,500 - $192,500 + equity + benefits. The salary range was determined by role, level, and possible location across the US. Individual pay is determined by work location and additional factors including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range based on your location during the hiring process.

If you require a reasonable accommodation to complete a job application or a job interview or to otherwise participate in the hiring process, please contact us at accommodationrequests@gofundme.com

Global Data Privacy Notice for Job Candidates and Applicants:

Depending on your location, the General Data Protection Regulation (GDPR) or certain US privacy laws may regulate the way we manage the data of job applicants. Our full notice outlining how data will be processed as part of the application procedure for applicable locations is available here. By submitting your application, you are agreeing to our use and processing of your data as required. 

Learn more about GoFundMe:

We’re proud to partner with GoFundMe.org, an independent public charity, to extend the reach and impact of our generous community, while helping drive critical social change. You can learn more about GoFundMe.org’s activities and impact in their FY ’24 annual report.

Our annual “Year in Help” report reflects our community’s impact in advancing our mission of helping people help each other.

For recent company news and announcements, visit our Newsroom.
