Machine Learning Evaluation Engineer (Agentic Mobile App Generator)

JetBrains

3+ Years | Amsterdam, Netherlands (On Site) | Full Time | 2 months ago

Job Summary

Join JetBrains' internal accelerator team as a Machine Learning Evaluation Engineer to build evaluation pipelines for AI code generation within the Compose Multiplatform ecosystem. You will design and maintain these pipelines and analyze evaluation results for AI-generated Kotlin Multiplatform apps, assessing their completeness, security, and performance. The role involves working with AI engineers, mobile experts, and product designers to establish quality standards and continuously improve AI features.

Job Description

Are you passionate about high-quality AI developer tools? We’re looking for a Machine Learning (ML) Evaluation Engineer to join our agentic mobile app generator project. This role focuses on building evaluation pipelines for AI code generation within the Compose Multiplatform ecosystem.

About the team and role

Our internal accelerator team is developing AI agents to generate Kotlin Multiplatform apps with fully working navigation, data persistence, and access to remote data sources. As an ML Evaluation Engineer, you will ensure our AI features work correctly and facilitate continuous improvement of the generated apps. If you’re excited about improving the quality of AI-generated code, skilled in building evaluation systems, and interested in new projects, this role is for you.

In this role, you will:

Design, build, and maintain evaluation pipelines.
Work with AI engineers, mobile experts, and product designers to establish quality standards and test plans.
Develop ways to assess the completeness, security, and performance of generated apps.
Analyze evaluation results, identify areas to improve code generation, and give feedback to the development team.
Develop and run test plans, test cases, and automated tests.
Improve our testing methods and QA processes in an agile environment.
Join discussions, design reviews, and brainstorming sessions around AI tools.

We'd like you to join the team if you:

Have at least three years of QA or evaluation engineering experience in commercial software, with a strong background in testing complex systems.
Have experience or interest in building and maintaining evaluation pipelines for AI-assisted development, prompt engineering, or ML-based code generation.
Have experience with data analysis tools or ML experiment tracking platforms.
Understand software testing methods, including functional, performance, and integration testing.
Are proficient in Python, Kotlin, Java, Swift, or similar languages.
Work well in distributed, cross-functional teams.
Have strong English communication skills and can explain complex ideas clearly.

We’ll be especially thrilled if you:

Are always looking for ways to improve developer workflows and productivity.
Have experience with low-code and/or no-code platforms or tools.
Are familiar with Compose Multiplatform, building Kotlin Multiplatform (KMP) libraries or frameworks, or IDE and plugin development.

Why work at JetBrains?

Impactful work: Directly influence how future mobile applications are built and tested globally by millions of developers.
Innovative culture: Work in an environment that values innovation, creativity, open communication, and respect.
Cutting-edge tech: Use new technology stacks with minimal bureaucracy, and focus on developing ideas.
Professional growth: Grow as an evaluation engineer through mentorship, teamwork, and learning about AI research and trends.
Work-life balance: Enjoy flexible work and a good work-life balance in a developer-focused culture.

Want to help shape AI-driven development?

Apply now for JetBrains' new agentic mobile app generator project. Tell us about yourself and why you want to build the next generation of agentic AI. We look forward to hearing from you!
