Agentic Analyst

2–4 Years • $37,440 PA – $49,920 PA
Business Analysis

Job Description

LILT is seeking an Agentic Analyst to join its BizOps-led Agentic Experimentation function. This role involves designing and running rapid experiments to enhance AI agents and production workflows. The analyst will convert business problems into structured experiments, script data pulls, iterate prompts, evaluate quality versus cost, and publish findings to guide product and workflow decisions across various AI variants.
Good To Have:
  • Experience evaluating LLM systems with AI judges, scripted checks, or human sampling.
  • Familiarity with MQM/LQE or similar linguistics QA frameworks.
  • Knowledge of RAG, vector stores, and retrieval verification strategies.
  • Familiarity with agentic workflows or translation/linguistics domains.
  • Basic MLOps/experimentation tools and prompt versioning best practices.
Responsibilities:
  • Design and execute end-to-end agentic experiments.
  • Script data pulls, iterate prompts, and analyze quality vs. cost.
  • Publish clear findings to inform product and workflow decisions.
  • Develop and optimize prompts for LILT’s AI agents and LLM workflows.
  • Automate data processing and experiment execution using Python.
  • Evaluate outputs with AI, programmatic checks, and human feedback.
  • Partner with Production to quantify workflow tradeoffs.
  • Communicate findings crisply and share reproducible artifacts.
  • Contribute to prompt and experiment hygiene.
Must Have:
  • 2-4 years in Data Science, Analytics, or Applied AI with Python.
  • Hands-on experience with LLMs and prompt engineering.
  • Strong analytical rigor for production settings.
  • Proficiency in SQL and cloud data warehouses like BigQuery.
  • Clear written communication for executive summaries.


About LILT

AI is changing how the world communicates — and LILT is leading that transformation.

We're on a mission to make the world's information accessible to everyone, regardless of the language they speak. We use cutting-edge AI, machine translation, and human-in-the-loop expertise to translate content faster, more accurately, and more cost-effectively without compromising on brand, voice, or quality.

At LILT, we empower our teammates with leading tools, global collaboration, and growth opportunities to do their best work. Our company virtues—Work together, win together; Find a way or make one; Quicker than they expect; Quality is Job 1—guide everything we do. We are trusted by Intel Corporation, Canva, the United States Department of Defense, the United States Air Force, ASICS, and hundreds of global enterprises. Backed by Sequoia, Intel Capital, and Redpoint, we’re building a category-defining company in a $50B+ global translation market being redefined by AI.

Contract Data Analyst (LLMs, Prompt Engineering) — Agentic Experimentation

About the role

LILT’s BizOps-led Agentic Experimentation function designs and runs rapid experiments that upgrade our AI agents and production workflows, working closely with Production, Product, Engineering, and Research. You will turn business problems into structured experiments, script data pulls, iterate prompts, evaluate quality vs. cost, and publish clear findings that inform product and workflow decisions across AI Review, AI QA, Human‑Optimized, and Workflow 5 variants.

What you’ll do

  • Design and run agentic experiments end to end using our standard process:
      1. Frame the problem and success criteria
      2. Script data pulls from BigQuery and assemble representative datasets
      3. Integrate and iterate on prompts in code
      4. Execute runs, collect outputs, and perform cost/quality analysis
      5. Summarize results with examples, metrics, and recommendations for rollout or follow‑ups
  • Develop, version, and optimize prompts for LILT’s agents and LLM-backed workflows, leveraging our model-agnostic stack and evaluation practices
  • Write robust Python to automate data processing, experiment execution, and result aggregation; use notebooks and lightweight scripts where appropriate
  • Evaluate outputs with a mix of AI and programmatic checks aligned to production realities, including error detection, terminology/style adherence, and “human-in-the-loop” checkpoints
  • Partner with Production on workflow trials across multiple configurations; quantify tradeoffs in quality, speed, and cost
  • Communicate findings crisply in docs and presentations; open or update Jira tickets and share reproducible artifacts (datasets, scripts, prompts, and dashboards)
  • Contribute to shared prompt and experiment hygiene: versioning, datasets, eval suites, and guardrails to prevent regressions
  • Test agent capabilities in our sandbox environment and provide structured feedback to Product/Engineering
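To make the experiment loop above concrete, here is a minimal, illustrative Python sketch. It is not LILT's actual tooling: the dataset stands in for a BigQuery pull, `run_model` is a stub for an LLM provider call, and the word-overlap scorer and per-token cost are placeholder assumptions. The shape of the loop is the point: run every prompt variant over the same dataset, then aggregate quality vs. cost per variant.

```python
from statistics import mean

# Placeholder dataset standing in for a BigQuery pull (assumption: each
# record has a source segment and a reference translation to score against).
DATASET = [
    {"source": "Hello, world", "reference": "Bonjour, le monde"},
    {"source": "Quality is Job 1", "reference": "La qualité est la priorité"},
]

# Two hypothetical prompt variants under comparison.
PROMPT_VARIANTS = {
    "v1-terse": "Translate to French: {source}",
    "v2-styled": "Translate to French, preserving brand tone: {source}",
}

def run_model(prompt: str) -> tuple[str, int]:
    """Stub for an LLM call; returns (output, token_count).
    A real implementation would call a provider API and read usage data."""
    return prompt.split(": ", 1)[1], len(prompt.split())

def score(output: str, reference: str) -> float:
    """Toy programmatic quality check: word overlap with the reference (0..1)."""
    out, ref = set(output.lower().split()), set(reference.lower().split())
    return len(out & ref) / len(ref) if ref else 0.0

def run_experiment(cost_per_token: float = 0.00001) -> dict:
    """Execute every variant over the dataset; aggregate quality vs. cost."""
    results = {}
    for name, template in PROMPT_VARIANTS.items():
        scores, tokens = [], 0
        for rec in DATASET:
            output, used = run_model(template.format(source=rec["source"]))
            scores.append(score(output, rec["reference"]))
            tokens += used
        results[name] = {"mean_quality": round(mean(scores), 3),
                         "cost_usd": round(tokens * cost_per_token, 6)}
    return results

print(run_experiment())
```

In practice the scorer would be swapped for AI judges, terminology/style checks, or human sampling, and the summary table would feed the write-up with examples and a rollout recommendation.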

Must-haves

  • 2–4 years in Data Science, Analytics, or Applied AI with demonstrable Python proficiency (pandas, data parsing, APIs, basic ETL)
  • Hands-on experience with LLMs and prompt engineering across providers (e.g., OpenAI, Anthropic, Vertex/Gemini, Bedrock), including practical eval and iteration cycles
  • Strong analytical rigor: can define success metrics, compare workflows, and reason clearly about quality/cost/speed tradeoffs in production settings
  • SQL and data wrangling skills; experience with BigQuery or equivalent cloud data warehouse
  • Clear written communication with exec-ready summaries and artifact links (reports, notebooks, Sheets, slides)

Nice-to-haves

  • Experience evaluating LLM systems with AI judges, scripted checks, or human sampling; familiarity with MQM/LQE or similar linguistics QA frameworks
  • Knowledge of RAG, vector stores, and retrieval verification strategies
  • Familiarity with agentic workflows or translation/linguistics domains
  • Basic MLOps/experimentation tools and prompt versioning best practices
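Prompt versioning, mentioned above, can be as lightweight as content-addressing each prompt revision so every experiment result references an exact prompt text. A minimal sketch (the registry layout is an assumption for illustration, not LILT's practice; in reality this would live in version control or a database):

```python
import hashlib
from datetime import datetime, timezone

# In-memory registry mapping a short content hash to the exact prompt text.
PROMPT_REGISTRY: dict[str, dict] = {}

def register_prompt(text: str) -> str:
    """Store a prompt revision under a short content hash and return its id.
    Identical text always yields the same id, so runs are reproducible."""
    prompt_id = hashlib.sha256(text.encode()).hexdigest()[:12]
    PROMPT_REGISTRY.setdefault(prompt_id, {
        "text": text,
        "registered_at": datetime.now(timezone.utc).isoformat(),
    })
    return prompt_id

# Tagging each experiment run with the prompt id pins results to a revision:
pid = register_prompt("Translate to French, preserving brand tone: {source}")
print(pid)
```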

Tooling and environment at LILT

  • Primary: Python, BigQuery, notebooks/Colab, Google Drive; Jira for work tracking; Notion for experiment documentation
  • Model orchestration is provider-agnostic; AI Review may route to different backends (e.g., Claude via Bedrock), so comfort with switching models and comparing output is key
  • Dashboarding: for new reporting, follow the LILT Dashboarding/Data Visualization Policy; prefer LILT Analytics and approved internal platforms over new Looker Studio reports

Ways of working

  • You’ll sit with BizOps’ Agentic Experimentation team and partner closely with Production, Product, and Research to scope, execute, and land changes in real workflows and product
  • Operate within our experiment playbook: short cycles, reproducibility, realistic data, and decision‑ready summaries

Why LILT

  • Be part of the team advancing agentic AI in enterprise translation, improving real production outcomes across quality, speed, and cost
  • Work with a company founded and led by AI researchers, shipping model‑agnostic, production‑grade agent capabilities to customers at global scale

How to apply

If you’re excited to build and evaluate agentic workflows with measurable impact, we’d love to hear from you.

Our Story

Our founders, Spence and John, met at Google while working on Google Translate. As researchers at Stanford and Berkeley, they both worked on language technology to make information accessible to everyone. While together at Google, they were amazed to learn that Google Translate wasn’t used for enterprise products and services inside the company. The quality just wasn’t there. So they set out to build something better. LILT was born.

LILT has been a machine learning company since its founding in 2015. At the time, machine translation didn’t meet the quality standard for enterprise translations, so LILT assembled a cutting-edge research team tasked with closing that gap. While meeting customer demand for translation services, LILT has prioritized investments in Large Language Models, human-in-the-loop systems, and now agentic AI.

With AI innovation accelerating and enterprise demand growing, the next phase of LILT’s journey is just beginning.

Our Tech

What sets our platform apart:

  • Brand-aware AI that learns your voice, tone, and terminology to ensure every translation is accurate and consistent
  • Agentic AI workflows that automate the entire translation process from content ingestion to quality review to publishing
  • 100+ native integrations with systems like Adobe Experience Manager, Webflow, Salesforce, GitHub, and Google Drive to simplify content translation
  • Human-in-the-loop reviews via our global network of professional linguists, for high-impact content that requires expert review
