AI Evaluation Manager

1 Month ago • 5 Years + • Research Development • $300,000 PA - $350,000 PA

Job Summary

Job Description

Luma is seeking an AI Evaluation Manager to shape and scale the understanding, measurement, and improvement of generative AI model performance. This role involves partnering with researchers, engineers, and technical artists to evaluate models against real-world creative use cases. The manager will design frameworks that capture qualitative nuance and identify actionable insights to guide development. The focus is on building evaluative systems that match the complexity of human perception and creativity, rather than simply checking metrics.
Must have:
  • Evaluate generative model performance.
  • Identify failure modes and regressions.
  • Develop scalable qualitative evaluation frameworks.
  • Collaborate with technical artists and engineers.
  • Translate product goals into evaluative criteria.
  • Lead qualitative studies and human-in-the-loop evaluations.
  • Provide feedback for model fine-tuning.
  • Stay informed about generative AI evaluation standards.
  • Master's degree in relevant field or equivalent experience.
  • 5+ years in product evaluation or UX research.
  • Familiarity with creative workflows and generative models.
  • Strong systems thinking for defining abstract qualities.
  • Experience working cross-functionally.
  • Excellent written communication and synthesis skills.
Good to have:
  • Background in motion, visual effects, or storytelling.
  • Experience evaluating AI-generated media.
  • Experience building internal qualitative data tools.
  • Familiarity with prompt engineering.

Job Details

About the Role

Luma is pushing the boundaries of generative AI, building tools that redefine how visual content is created. We’re seeking a candidate to help shape and scale the way we understand, measure, and improve model performance. In this role, you’ll partner with researchers, engineers, and technical artists to evaluate our models against real-world creative use cases, design frameworks that capture qualitative nuance, and identify actionable insights that guide development.

This is not a checkbox-metrics role: it's about building evaluative systems that match the complexity of human perception, creativity, and intention.

Responsibilities

  • Evaluate generative model performance across diverse tasks, prompts, and modalities.

  • Identify key failure modes, regression patterns, and edge cases that impact product quality.

  • Develop and maintain qualitative evaluation frameworks that are scalable and reusable.

  • Collaborate closely with technical artists and engineers to align evaluations with model capabilities and target use cases.

  • Translate high-level product goals into concrete evaluative criteria.

  • Lead qualitative studies, side-by-side comparisons, and human-in-the-loop evaluation efforts.

  • Provide detailed feedback that informs model fine-tuning, dataset curation, and product UX.

  • Stay informed about emerging evaluation standards in generative AI and creative tools.

Qualifications

  • Master’s degree or higher in Cognitive Science, Human-Computer Interaction (HCI), Design Research, Psychology, Media Studies, or a related field.

  • 5+ years of experience in product evaluation, UX research, model testing, or similar roles that involve structured qualitative assessment.

  • Deep familiarity with creative workflows and real-world use cases for generative models (e.g., animation, filmmaking, digital art, VFX).

  • Strong systems thinking and the ability to define abstract qualities (like believability, identity retention, or scene coherence) in clear evaluative terms.

  • Experience working cross-functionally with engineers, researchers, and creatives.

  • Excellent written communication skills and the ability to synthesize nuanced judgments into clear, actionable insights.

Nice to Have

  • Background in motion, visual effects, or storytelling pipelines

  • Experience evaluating AI-generated media (video, images, 3D)

  • Prior work on building internal tools for qualitative data collection or scoring

  • Familiarity with prompt engineering and reference-based input methods

About The Company

Palo Alto, California, United States (Hybrid)
