Senior Site Reliability Engineer (SRE)

1 Year ago • 5+ Years • DevOps

Job Summary

Job Description

As a Site Reliability Engineer (SRE) at Sword Health, you will be vital in maintaining the health and uptime of our services. You will collaborate with development teams to build and operate scalable and resilient systems, troubleshoot issues across the stack, and implement automation to reduce manual work. You'll be responsible for monitoring, incident management, automation, tooling, performance optimization, security, compliance, documentation, knowledge sharing, and database management. You'll be working with a team of over 900 talented colleagues across three continents to build a pain-free world, powered by AI, enhanced by people — accessible to all.

Job Details

Sword Health is on a mission to free two billion people from pain. 


At Sword, we are using AI Care to change lives and save millions for our 25,000+ enterprise clients across three continents: 67% of members achieve a pain-free life, and surgery intent falls by 70%. Today, we hold the majority of industry patents, win 70% of competitive evaluations, and have raised more than $300 million from top venture firms like Founders Fund, Sapphire Ventures, General Catalyst, and Khosla Ventures.


Our recognition as a Forbes Best Startup Employer in 2025 highlights our focus on being a destination for the best and brightest talent. Not only have we experienced unprecedented growth since our market debut in 2020, but we’ve also created a remarkable mission- and value-driven environment that is loved by our growing team. With a recent valuation of $3 billion, we are in a phase of hypergrowth and expansion, and we’re looking for individuals with passion, commitment, and energy to help us scale our global impact.


Joining Sword means committing to a set of core values, chief amongst them to “do it for the patients” every day, and to always “deliver more than expected” on behalf of our members and clients.


This is an opportunity for you to make a significant difference on a massive scale as you work alongside 900+ (and growing!) talented colleagues, spanning three continents. Your charge? To help us build a pain-free world, powered by AI, enhanced by people — accessible to all.



As a Site Reliability Engineer (SRE) at Sword Health, you will play a critical role in maintaining the health and uptime of our services. You will collaborate with development teams to build and operate scalable and resilient systems, troubleshoot issues across the stack, and implement automation to reduce manual work.


What you'll be doing:
  • Monitoring and Incident Management: Develop and maintain monitoring and alerting solutions. Respond to incidents, troubleshoot issues, and perform root cause analysis.
  • Automation and Tooling: Automate repetitive tasks and improve deployment processes. Develop and maintain tools to support infrastructure and applications.
  • Performance Optimization: Analyze system performance and implement optimizations to improve efficiency and reduce latency.
  • Security and Compliance: Ensure systems are secure and compliant with relevant standards and regulations.
  • Documentation and Knowledge Sharing: Maintain comprehensive documentation of systems and processes. Share knowledge and best practices with team members.
  • Database Management: Ensure the reliability, performance, and scalability of databases. Perform database optimization, maintenance, and troubleshooting.


What you need to have:
  • Proficiency in programming languages such as Python, Go, or JavaScript.
  • 5+ years of experience with cloud platforms such as AWS, Google Cloud, or Azure.
  • Strong understanding of Linux/Unix systems and networking.
  • Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
  • Knowledge of CI/CD pipelines and tools (e.g., Jenkins, GitLab CI).
  • Database Experience: Proficiency with relational and NoSQL databases (e.g., MySQL, PostgreSQL, Redis, Elasticsearch).
  • Team Player: Willingness to collaborate and share knowledge with colleagues to drive collective success.
  • Ownership: Taking responsibility for your work and demonstrating accountability for outcomes.


What we would love to see:
  • Innovative Mindset: A passion for exploring new technologies and methodologies to improve reliability and performance.
  • Proactive Approach: Ability to anticipate potential issues and implement preventive measures.
  • Continuous Improvement: A dedication to learning and growing in your role, staying updated with industry trends and best practices.


To ensure you feel good solving a big Human problem, we offer:
  • A stimulating, fast-paced environment with lots of room for creativity;
  • A bright future at a promising high-tech startup company;
  • Career development and growth, with a competitive salary;
  • The opportunity to work with a talented team and to add real value to an innovative solution with the potential to change the future of healthcare;
  • A flexible environment where you can control your hours (remotely) with unlimited vacation; 
  • Access to our health and well-being program (digital therapist sessions);
  • Remote or Hybrid work policy (Portugal only);
  • To get to know more about our Tech Stack, check here.

