Craft: Technology & Development
Job Description:
Senior Platform Engineer
------------------------
Location: Stockholm, Malmö, Barcelona, London, Berlin
Team: ML Toolchain
About the Role
We’re looking for a Senior Platform Engineer with a strong focus on cloud infrastructure to join our ML Toolchain team, the team building the foundational tooling that enables machine learning at scale across King. You’ll play a key role in designing and building a modern, developer-friendly ML platform on top of Google Cloud Platform (GCP), Kubernetes, and a suite of open-source tools.
The platform is primarily built in Python, exposing APIs and a front-end to support ML practitioners across the company. As an engineer in this team, you’ll work across infrastructure and software boundaries, driving the platform’s evolution with a focus on usability, scalability, and reliability.
What You’ll Do
- Design, build, and improve cloud infrastructure components that support ML workloads and shared services across the organisation
- Contribute to the standardisation and consolidation of infrastructure by driving migrations to a centralised, platform-aligned model
- Develop and maintain self-service tooling to empower ML engineers and developer teams to deploy, monitor, and manage their own services with minimal friction
- Collaborate with platform teams, ML engineers, and service owners to identify pain points, reduce fragmentation, and improve infrastructure reliability and performance
- Implement and evolve practices around automation, observability, CI/CD, and infrastructure security
- Stay hands-on with emerging cloud-native technologies and drive their adoption where relevant (e.g., CNCF tools, open source frameworks, Google Cloud services)
- Work in Python-based environments, contributing to the stability and scalability of ML-native workflows, pipelines, and orchestration systems
- Contribute to knowledge-sharing, code reviews, and documentation practices that strengthen the team’s effectiveness
- Understand and promote the principles of platform engineering
Required Skills
- Extensive experience with cloud platforms, particularly Google Cloud Platform (GCP), including its core services and best practices
- Deep proficiency in Kubernetes (GKE) and container orchestration
- Strong background in Infrastructure as Code (IaC) using Terraform, Kustomize and Helm (or similar tools)
- Proven experience building and operating CI/CD pipelines, using tools like Cloud Build and Argo CD
- Solid programming skills in Python
- Hands-on experience in designing, building, and testing APIs and self-service platforms
- Strong grasp of Git, GitOps, coding standards, and code review practices
- Experience managing and working with Cloud SQL databases such as PostgreSQL and MySQL
- Strong understanding of observability practices, including metrics, logging, dashboards, and alerting
- Experience using Crossplane for cloud-native provisioning and Argo Workflows and Argo Events for orchestration
- Expertise in Linux, along with a solid understanding of networks, DNS, firewalls, and security best practices
- Strong communicator, fluent in spoken and written English
- Demonstrated ability to work independently and collaborate effectively across teams
Nice to Have Skills
- Familiarity with ML infrastructure and tools, including Vertex AI, Dataproc, Spark, MLflow, and Ray, or similar systems
- Knowledge of Java and/or Go (including Go templating for Crossplane)
- Familiarity with MLOps workflows