Site Reliability Engineer II

3 Hours ago • 3 Years +

Job Summary

Job Description

The Site Reliability Engineer (SRE) will be responsible for the full system lifecycle including infrastructure provisioning, system configuration, monitoring, and incident response in production environments. They will work closely with development teams, operations teams, network engineers, database administrators, technology vendors, and partners to ensure application performance and availability. The SRE will guide incident responses, identify root causes, and provide solutions to mitigate and resolve issues. This role requires experience in high-traffic SaaS environments and expertise in delivering high availability. The SRE will also design and build cloud infrastructure, participate in performance analysis and capacity planning, manage platform scalability, and implement monitoring enhancements.
Must have:
  • 3+ years of experience in operating high-traffic SaaS environments
  • Deep expertise in delivering high availability
  • Skills to build a fully automated cloud orchestration framework on AWS
  • Experience with containerized infrastructure in production (Kubernetes, EKS, ECS)
  • Experience implementing configuration management solutions using Infrastructure as Code
  • Strong working knowledge of Linux
  • Solid scripting skills (e.g. Bash, Python)
  • Experience with performance diagnostics and monitoring
Good to have:
  • Building PCI compliant systems
  • Working with infrastructure for payment processing systems
  • Developing high-volume transaction systems

Job Details

Want to help us help others? We’re hiring! 

GoFundMe is the world’s most powerful community for good, dedicated to helping people help each other. By uniting individuals and nonprofits in one place, GoFundMe makes it easy and safe for people to ask for help and support causes—for themselves and each other. Together, our community has raised more than $40 billion since 2010.

Come join us! The GoFundMe team is searching for our next Site Reliability Engineer (SRE). You will be responsible for the full system lifecycle including infrastructure provisioning, system configuration, monitoring, and incident response in production environments. The SRE uses technical analysis to assess the availability, latency, scalability, and efficiency of a product or infrastructure and builds reliability into systems. To ensure the highest level of application performance and availability, the reliability engineer works closely with development teams, relevant functional operations teams, network engineers, database administrators, technology vendors and partners. The successful reliability engineer effectively guides incident responses, helps identify root causes and provides recommendations or solutions to mitigate and resolve issues.

Candidates considered for this role will be located in Buenos Aires, Argentina. There is an in-office requirement of 2-3 days a week.

The Job

  • Design and build out our cloud infrastructure (we run everything in AWS).
  • Participate in software and system performance analysis, tuning, and service capacity planning.
  • Manage the availability, scalability, security, and performance of our platform and applications.
  • Diagnose bottlenecks across the full stack and recommend interim workarounds while long-term solutions are investigated.
  • Periodically assess all monitoring requirements and implement enhancements to meet or exceed changing business needs.
  • Proactively review, recommend, and implement changes to the live infrastructure after ensuring the right validation has been carried out.
  • Work across engineering to improve the SLO/SLI framework.
  • Use data analysis to pick up trends before they become major problems.
  • Perform 24/7 on-call duties.

You

  • 3+ years of experience in operating high-traffic SaaS environments.
  • Deep expertise in the mentality, processes, and tools needed to deliver high availability.
  • Skills to build a fully automated, highly elastic cloud orchestration framework on AWS.
  • Experience running containerized infrastructure in production (Kubernetes using EKS, AWS ECS).
  • Experience implementing configuration management and automation solutions using Infrastructure as Code, CI/CD, and GitOps (Ansible, Terraform, ArgoCD, GitHub Actions).
  • Strong working knowledge of Linux and its underlying components, system statistics, performance tuning, filesystems and IO.
  • Solid scripting skills (e.g. Bash, Python).
  • Experience with performance diagnostics, performance tuning, capacity planning, and monitoring.
  • BS in Computer Science or equivalent.
  • Good verbal and written communication skills.

Preferred 

  • Building PCI compliant systems
  • Working with infrastructure for payment processing systems
  • Developing high-volume transaction systems
  • Passion for building fault tolerant and secure platforms

Technologies you are likely to be working with

AWS, Docker, Kubernetes, ECS, Helm, ArgoCD, Cloudflare, Terraform, Ansible, MySQL/Aurora, Nginx, Loft, DevSpace, Elasticsearch, Kafka, Redis, GitHub, Bash, Python, PHP, Java, Kotlin, Sumo Logic, New Relic, PagerDuty

Why you’ll love it here

  • Make an Impact: Be part of a mission-driven organization making a positive difference in millions of lives every year.
  • Innovative Environment: Work with a diverse, passionate, and talented team in a fast-paced, forward-thinking atmosphere.
  • Collaborative Team: Join a fun and collaborative team that works hard and celebrates success together.
  • Competitive Benefits: Enjoy competitive pay and comprehensive healthcare benefits.
  • Holistic Support: Enjoy financial assistance for things like hybrid work and family planning, along with generous parental leave, flexible time-off policies, and mental health and wellness resources to support your overall well-being.
  • Growth Opportunities: Participate in learning, development, and recognition programs to help you thrive and grow.
  • Commitment to DEI: Contribute to diversity, equity, and inclusion through ongoing initiatives and employee resource groups.
  • Community Engagement: Make a difference through our volunteering and Gives Back programs.

We live by our core values: impatient to be great, find a way, earn trust every day, fueled by purpose. Be a part of something bigger with us!

GoFundMe is proud to be an equal opportunity employer that actively pursues candidates of diverse backgrounds and experiences. We do not discriminate on the basis of race, color, religion, ethnicity, nationality or national origin, sex, sexual orientation, gender, gender identity or expression, pregnancy status, marital status, age, medical condition, mental or physical disability, or military or veteran status.

Individual pay is determined by work location and additional factors including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range based on your location during the hiring process. 

If you require a reasonable accommodation to complete a job application or a job interview or to otherwise participate in the hiring process, please contact us at accommodationrequests@gofundme.com

Global Data Privacy Notice for Job Candidates and Applicants:

Depending on your location, the General Data Protection Regulation (GDPR) or certain US privacy laws may regulate the way we manage the data of job applicants. Our full notice outlining how data will be processed as part of the application procedure for applicable locations is available here. By submitting your application, you are agreeing to our use and processing of your data as required. 

Learn more about GoFundMe:

We’re proud to partner with GoFundMe.org, an independent public charity, to extend the reach and impact of our generous community, while helping drive critical social change. You can learn more about GoFundMe.org’s activities and impact in their FY ‘24 annual report.

Our annual “Year in Help” report reflects our community’s impact in advancing our mission of helping people help each other.

For recent company news and announcements, visit our Newsroom.

