Site Reliability Engineer
N-iX
Job Summary
N-iX is seeking an experienced Site Reliability Engineer to join a project entering a pivotal phase, with a major go-live planned for mid-February and a target of 75,000 users. The role involves ensuring the stability, scalability, and operational excellence of a Kubernetes-based platform in a hybrid environment. Key responsibilities include performance optimization, scaling strategies, observability, and reliability engineering, with a focus on the challenges expected as user activity increases.
Must Have
- 4+ years of experience as SRE / DevOps Engineer
- Strong hands-on experience with Kubernetes in production
- Experience working with hybrid infrastructure (on-prem + cloud)
- Solid knowledge of PostgreSQL performance tuning and scaling
- Experience with Qdrant or other vector databases
- Experience with Helm, Kubernetes autoscaling, and resource optimization
- Familiarity with observability stacks (Prometheus, Grafana, ELK/Loki)
- Understanding of performance engineering and load testing
- Experience with Linux systems and networking
- Strong troubleshooting and incident-management skills
Good to Have
- Experience with STACKIT or other sovereign clouds
- Experience with PgBouncer
- Knowledge of SRE practices (SLO/SLI)
- Experience in regulated or public-sector environments
- German language skills
Perks & Benefits
- Flexible working format - remote, office-based or flexible
- A competitive salary and good compensation package
- Personalized career growth
- Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
- Active tech communities with regular knowledge sharing
- Education reimbursement
- Memorable anniversary presents
- Corporate events and team-building activities
- Other location-specific benefits
Job Description
Project:
N-iX is a global software solutions and engineering services company.
We are looking for an experienced Site Reliability Engineer to ensure the stability, scalability, and operational excellence of a Kubernetes-based platform running in a hybrid environment.
The project is entering a pivotal phase, with a major go-live planned for mid-February and a target audience of 75,000 users. User onboarding is already underway, with over 5,000 users connected and 15,000–20,000 expected to be active by year-end. While the system is stable, we anticipate increased activity and new challenges in January, February, and after the go-live—making this an exciting opportunity to make a real impact. The role focuses on performance optimization, scaling strategies, observability, and reliability engineering.
Responsibilities:
- Operate and optimize hybrid infrastructure (on-prem & STACKIT)
- Manage and scale Kubernetes clusters
- Optimize Helm charts, resource usage, and autoscaling
- Conduct performance, load, and stress testing
- Ensure reliability, availability, and monitoring of production systems
- Tune and operate PostgreSQL
- Operate and optimize vector databases (e.g. Qdrant)
- Implement monitoring, logging, and alerting
- Support incident response and capacity planning