Senior Site Reliability Engineer (SRE)

11 Months ago • 5+ Years • DevOps

Job Summary

Job Description

As a Site Reliability Engineer (SRE) at Sword Health, you will be vital in maintaining the health and uptime of our services. You will collaborate with development teams to build and operate scalable and resilient systems, troubleshoot issues across the stack, and implement automation to reduce manual work. You'll be responsible for monitoring, incident management, automation, tooling, performance optimization, security, compliance, documentation, knowledge sharing, and database management. You'll be working with a team of over 900 talented colleagues across three continents to build a pain-free world, powered by AI, enhanced by people — accessible to all.

Job Details

Sword Health is on a mission to free two billion people from pain. 


With 67% of members achieving a pain-free life and a 70% reduction in surgery intent, we at Sword are using AI Care to change lives and save millions for our 25,000+ enterprise clients across three continents. Today, we hold the majority of industry patents, win 70% of competitive evaluations, and have raised more than $300 million from top venture firms like Founders Fund, Sapphire Ventures, General Catalyst, and Khosla Ventures.


We were recognized as a Forbes Best Startup Employer in 2025, an award that highlights our focus on being a destination for the best and brightest talent. Not only have we experienced unprecedented growth since our market debut in 2020, but we’ve also created a remarkable mission- and value-driven environment that is loved by our growing team. With a recent valuation of $3 billion, we are in a phase of hypergrowth and expansion, and we’re looking for individuals with passion, commitment, and energy to help us scale our global impact.


Joining Sword means committing to a set of core values, chief amongst them to “do it for the patients” every day, and to always “deliver more than expected” on behalf of our members and clients.


This is an opportunity for you to make a significant difference on a massive scale as you work alongside 900+ (and growing!) talented colleagues, spanning three continents. Your charge? To help us build a pain-free world, powered by AI, enhanced by people — accessible to all.



As a Site Reliability Engineer (SRE) at Sword Health, you will play a critical role in maintaining the health and uptime of our services. You will collaborate with development teams to build and operate scalable and resilient systems, troubleshoot issues across the stack, and implement automation to reduce manual work.


What you'll be doing:
  • Monitoring and Incident Management: Develop and maintain monitoring and alerting solutions. Respond to incidents, troubleshoot issues, and perform root cause analysis.
  • Automation and Tooling: Automate repetitive tasks and improve deployment processes. Develop and maintain tools to support infrastructure and applications.
  • Performance Optimization: Analyze system performance and implement optimizations to improve efficiency and reduce latency.
  • Security and Compliance: Ensure systems are secure and compliant with relevant standards and regulations.
  • Documentation and Knowledge Sharing: Maintain comprehensive documentation of systems and processes. Share knowledge and best practices with team members.
  • Database Management: Ensure the reliability, performance, and scalability of databases. Perform database optimization, maintenance, and troubleshooting.


What you need to have:
  • Proficiency in programming languages such as Python, Go, or JavaScript.
  • 5+ years of experience with cloud platforms such as AWS, Google Cloud, or Azure.
  • Strong understanding of Linux/Unix systems and networking.
  • Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
  • Knowledge of CI/CD pipelines and tools (e.g., Jenkins, GitLab CI).
  • Database Experience: Proficiency with relational and NoSQL databases (e.g., MySQL, PostgreSQL, Redis, Elasticsearch).
  • Team Player: Willingness to collaborate and share knowledge with colleagues to drive collective success.
  • Ownership: Taking responsibility for your work and demonstrating accountability for outcomes.


What we would love to see:
  • Innovative Mindset: A passion for exploring new technologies and methodologies to improve reliability and performance.
  • Proactive Approach: Ability to anticipate potential issues and implement preventive measures.
  • Continuous Improvement: A dedication to learning and growing in your role, staying updated with industry trends and best practices.


To ensure you feel good solving a big human problem, we offer:
  • A stimulating, fast-paced environment with lots of room for creativity;
  • A bright future at a promising high-tech startup company;
  • Career development and growth, with a competitive salary;
  • The opportunity to work with a talented team and to add real value to an innovative solution with the potential to change the future of healthcare;
  • A flexible environment where you can control your hours (remotely) with unlimited vacation; 
  • Access to our health and well-being program (digital therapist sessions);
  • Remote or Hybrid work policy (Portugal only);
  • To get to know more about our Tech Stack, check here.

