ML Engineer

3+ Years • Research Development

Job Summary

Job Description

Perks:
  • Our people are nurtured and developed to reach their full potential.
  • Opportunities to learn from some of the best in the business.
  • Culture that encourages curiosity and empowers people to find solutions and act on their instincts.
  • Recognize the many benefits of a diverse workforce with equitable opportunities for everyone.
  • Strive for an inclusive workplace that empowers all our colleagues to thrive.

Job Details

About Springer Nature Group

Springer Nature opens the doors to discovery for researchers, educators, clinicians and other professionals. Every day, around the globe, our imprints, books, journals, platforms and technology solutions reach millions of people. For over 180 years our brands and imprints have been a trusted source of knowledge to these communities and today, more than ever, we see it as our responsibility to ensure that fundamental knowledge can be found, verified, understood and used by our communities – enabling them to improve outcomes, make progress, and benefit the generations that follow. Visit group.springernature.com and follow @SpringerNature / @SpringerNatureGroup

Who we are

At Springer Nature AI Labs (SNAIL), we’re shaping the future of scientific publishing through responsible, human-centred AI. Our team is at the forefront of integrating advanced AI technologies to optimize processes and enhance the user experience for researchers and academics worldwide. We value a collaborative work environment where ideas flourish and innovation is encouraged. With our curiosity-driven, impact-first culture, we focus on delivering AI innovation at scale, always with integrity and in close collaboration across functions. Our commitment to long-term growth ensures that our people are nurtured and developed to reach their full potential.

Who you are

As an ML Engineer focused on LLM evaluation, you will design and build both qualitative and quantitative frameworks for assessing large language model outputs, optimize prompts and workflows, and collaborate with cross‑functional teams to ensure our generative AI solutions meet rigorous standards of quality, reliability and ethics.

What You’ll Do

  • Develop Evaluation Frameworks: Architect end-to-end pipelines that combine automated metrics (BLEU, ROUGE, BERTScore, custom error rates) with human‑in‑the‑loop assessments.
  • Quantitative Analysis: Implement statistical and machine‑learning methods to measure LLM performance—accuracy, relevance, bias, fairness, robustness—and analyze trends over releases.
  • Qualitative Assessment: Design annotation guidelines, recruit/train reviewers, and lead structured reviews of model outputs for coherence, factuality and style.
  • Prompt Engineering & Optimization: Use tools like DSPy to craft, test and (automatically) refine prompts; analyze A/B test experiments to maximize response quality and task success.
  • Custom Tooling: Build reusable Python libraries and dashboards for monitoring LLM behaviour, automating evaluation workflows and integrating with our CI/CD approaches.
  • Collaboration & Reporting: Partner with research, product and MLOps teams to translate user needs into evaluation requirements; present findings, drive data‑backed decisions and iterate on model improvements.
  • Best Practices & Ethics: Champion documentation, version control, testing standards and fairness audits. Stay up to date on responsible AI guidelines and industry benchmarks.

Must-Have Qualifications

  • Education: MSc or higher in CS, Engineering, Data Science or a related field.
  • AI/ML Expertise: deep knowledge of ML algorithms, with a focus on NLP and transformers.
  • GenAI Expertise: 1+ years of experience evaluating, optimizing, and productionizing GenAI products.
  • Software/Cloud: 3+ years of production experience with Python; experience with Docker, Kubernetes, FastAPI; hands-on experience with any major cloud provider (GCP/Azure/AWS).
  • MLOps: familiarity with CI/CD for models, monitoring, versioning, pipelines (e.g. Kubeflow).
  • Communication: business-fluent English; able to translate complex concepts for diverse stakeholders.

Nice‑to‑Have

  • Knowledge of fairness, bias detection and mitigation techniques for generative models.
  • Experience with open‑source (self-hosted) LLMs (e.g. LLaMA, Qwen).
  • Experience with LLM tracing and prompt management platforms (e.g. Langfuse).

By joining Springer Nature, you will actively contribute to the development and implementation of AI solutions that drive the future of scientific publishing. As a leader, you will guide your team to innovate and grow, pushing the boundaries of what’s possible in AI. Join us as we pioneer the future of scientific publishing through artificial intelligence.

Internal applicants: We encourage you to speak to your manager once the interview process has started. At the point of offer acceptance, you are required to inform your manager. If for any reason you’re unable to do so, please contact HR, who can provide guidance as required.

At Springer Nature we value the diversity of our teams. We recognize the many benefits of a diverse workforce with equitable opportunities for everyone. We strive for an inclusive workplace that empowers all our colleagues to thrive. Our search for the best talent fully encompasses and embraces these values and principles. Springer Nature was awarded Diversity Team of the Year at the 2022 British Diversity Awards. Find out more about our DEI work at https://group.springernature.com/gp/group/taking-responsibility/diversity-equity-inclusion

For more information about career opportunities in Springer Nature please visit https://careers.springernature.com/

About The Company

We are an ambitious and dynamic organisation, and home to some of the best-known names in research, educational and professional publishing. Working at the heart of a changing industry, we are always looking for great people who care about delivering quality to our customers and the communities we work alongside. In return, you will find that we open the doors to discovery for all our employees – offering opportunities to learn from some of the best in the business, with a culture that encourages curiosity and empowers people to find solutions and act on their instincts.
