MLOps Engineer

Job Description

At Springer Nature AI Labs (SNAIL), we are seeking an experienced MLOps Engineer to build and operate end-to-end ML/LLM pipelines in the cloud. This role involves automating workflows with Kubeflow and GitHub Actions, serving models using Docker and FastAPI on Google Cloud Vertex AI and Kubernetes, and ensuring robust observability and monitoring. The ideal candidate will optimize cloud costs, collaborate with ML and DataOps Engineers, and contribute to a culture of innovation and continuous improvement, while also mentoring junior team members.
Perks:
  • Opportunities to learn from industry experts
  • Culture that encourages curiosity and empowers problem-solving
  • Commitment to employee nurturing and development for full potential
  • Collaborative and innovative work environment
  • Diverse, equitable, and inclusive workplace

About Springer Nature Group

Springer Nature opens the doors to discovery for researchers, educators, clinicians and other professionals. Every day, around the globe, our imprints, books, journals, platforms and technology solutions reach millions of people. For over 180 years our brands and imprints have been a trusted source of knowledge to these communities and today, more than ever, we see it as our responsibility to ensure that fundamental knowledge can be found, verified, understood and used by our communities – enabling them to improve outcomes, make progress, and benefit the generations that follow. Visit group.springernature.com and follow @SpringerNature / @SpringerNatureGroup

Department: Springer Nature AI Labs

Who we are

At Springer Nature AI Labs (SNAIL), we’re shaping the future of scientific publishing through responsible, human-centred AI. Our team is at the forefront of integrating advanced AI technologies to optimize processes and enhance the user experience for researchers and academics worldwide. We value a collaborative work environment where ideas flourish and innovation is encouraged. With our curiosity-driven, impact-first culture, we focus on delivering AI innovation at scale, always with integrity and in close collaboration across functions. Our commitment to long-term growth ensures that our people are nurtured and developed to reach their full potential.

Who you are

You are an experienced MLOps engineer who loves turning prototypes into reliable, scalable AI systems in the cloud. You balance speed with robustness, automate everything you can, and care deeply about reproducibility, observability and cost efficiency. You are comfortable in a fast-moving environment and enjoy solving complex infrastructure problems. As an experienced engineer, you are happy to mentor junior teammates while continuously improving yourself in this fast-paced field. You thrive in a culture of proactivity, curiosity, experimentation, and teamwork.

What You’ll Do

  • Build and operate end-to-end ML/LLM pipelines: data ingestion, feature processing, training, evaluation, packaging, registry and deployment.
  • Automate workflows: design fault-tolerant training/inference pipelines with Kubeflow; implement CI/CD for ML with GitHub Actions and reusable templates.
  • Serve models: containerize ML models (Docker), expose APIs (FastAPI), and deploy to Google Cloud Vertex AI and Kubernetes (see the minimal sketch after this list).
  • Ensure observability and monitoring: implement metrics, logs and traces; set up model/data quality checks, drift detection and alerting.
  • Optimize cloud cost and performance: fine-tune compute resource usage and apply cloud best practices.
  • Collaborate: work with ML Engineers and Data(Ops) Engineers to deliver quality products; review code and documentation; apply best coding practices for maintainable and reusable code; support junior colleagues to grow the MLOps capabilities.
  • Contribute to our culture: bring an experimentation mindset, propose improvements, and help us stay current with modern MLOps tooling and practices.
  • Coach and mentor more junior team members, ensuring that MLOps skills and capabilities are developed at scale.
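By way of illustration, a minimal sketch of the kind of model-serving endpoint described above, assuming a TorchScript artifact; the file name, input schema, and routes are hypothetical placeholders, not an existing SNAIL service.

    # Minimal FastAPI serving sketch (illustrative; model path and schema are hypothetical).
    import torch
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI(title="example-model-service")

    class PredictRequest(BaseModel):
        features: list[float]  # hypothetical input schema

    # In practice the artifact would come from a model registry; "model.pt" is a placeholder.
    model = torch.jit.load("model.pt")
    model.eval()

    @app.get("/healthz")
    def healthz() -> dict:
        # Liveness endpoint for Kubernetes / Vertex AI health checks.
        return {"status": "ok"}

    @app.post("/predict")
    def predict(req: PredictRequest) -> dict:
        # Run inference without tracking gradients.
        with torch.no_grad():
            output = model(torch.tensor([req.features]))
        return {"prediction": output.squeeze(0).tolist()}

A service along these lines would typically be built into a Docker image and deployed to Vertex AI or Kubernetes, with health checks and Prometheus metrics wired in.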

Must-Have Qualifications

  • Education: BSc or MSc in Math, Physical Sciences, Computer Science, Software Engineering, AI/ML or related field.
  • Software: strong Python proficiency, sound testing practices, Git/GitHub, GitHub Actions, and Docker; experience building APIs with FastAPI or similar.
  • Cloud: hands-on experience with at least one major provider (GCP/AWS/Azure) and core services (compute, storage, networking, AI platforms).
  • MLOps: experience with pipeline orchestration (Airflow/Kubeflow), experiment tracking/model registry.
  • Monitoring: practical experience setting up dashboards and alerts (e.g., Prometheus/Grafana/OpenTelemetry) and ML-specific monitoring for data drift and performance.
  • Frameworks: PyTorch or TensorFlow in production contexts.
  • Communication: clear, proactive communicator in English; able to collaborate with diverse stakeholders.
  • Work mode: open to hybrid work; team typically spends ~2 days/week in the office.

Nice‑to‑Have

  • LLMOps: prompt and experiment tracking (e.g., Langfuse), evaluation frameworks, guardrails, vector databases (e.g. Pinecone).
  • Infrastructure as Code and packaging: Terraform / Pulumi.
  • Experience with, or an understanding of, deploying in Kubernetes.
  • Inference optimization: model quantization, Triton/vLLM/TensorRT, GPU operators and scheduling.

By joining Springer Nature, you will actively contribute to the development and implementation of AI solutions that drive the future of scientific publishing. As an experienced engineer, you will help your teammates innovate and grow, pushing the boundaries of what’s possible in AI. Join us as we pioneer the future of scientific publishing through artificial intelligence.

Internal applicants:

We encourage you to speak with your manager once the interview process has started. At the point of offer acceptance, it is required that you inform your manager. If for any reason you’re unable to do so, please contact HR who can provide guidance as required.

At Springer Nature we value the diversity of our teams. We recognize the many benefits of a diverse workforce with equitable opportunities for everyone. We strive for an inclusive workplace that empowers all our colleagues to thrive. Our search for the best talent fully encompasses and embraces these values and principles. Springer Nature was awarded Diversity Team of the Year at the 2022 British Diversity Awards. Find out more about our DEI work here: https://group.springernature.com/gp/group/taking-responsibility/diversity-equity-inclusion

For more information about career opportunities at Springer Nature, please visit https://careers.springernature.com/

#LI-AR1
