LLM Ops Engineer

Yahoo

2+ Years | United States (Hybrid) | Full Time | 1 month ago

Job Summary

This role is for an LLM Ops Engineer passionate about Generative AI and its practical application within Yahoo. The team builds AI-driven experiences using large language models across major platforms. You will be responsible for the entire lifecycle of AI products, focusing on designing, implementing, and maintaining robust MLOps/LLM Ops pipelines. This includes continuous integration, delivery, and monitoring of AI models, enhancing evaluation frameworks, and systematically refining prompts for quality and safety. You will manage data lineage, collaborate with engineering teams, and optimize production AI services, making a significant impact on Yahoo's business and customers.

Must Have

Design, implement, and maintain robust MLOps/LLM Ops pipelines.
Maintain and enhance evaluation frameworks to benchmark new LLMs.
Systematically test and refine prompts to optimize for quality, relevance, safety, latency, and cost.
Implement and monitor systems for detecting and mitigating accuracy and safety issues.
Manage data lineage and versioning for training, validation, and evaluation datasets.
Collaborate with engineering teams to integrate and test AI functionalities.
Troubleshoot and optimize production AI services for latency, cost, and reliability.
Perform code reviews, maintain high code quality standards, and ensure proper documentation.
BS in Computer Science or a related field, or equivalent practical experience.
2+ years of professional software development experience.
1+ years of hands-on experience in AI/ML, with specific exposure to Large Language Models (LLMs) and Generative AI.
Strong programming proficiency, particularly in Python.
Experience with MLOps principles and tools (e.g., CI/CD pipelines, monitoring, automated testing).
Familiarity with major LLM platforms and APIs (e.g., VertexAI, OpenAI, AWS Bedrock, or open-source equivalents).
Solid understanding of software design patterns and distributed systems.

Good to Have

MS in Computer Science or a related field.
Some experience with on-device ML frameworks (e.g., Core ML, TensorFlow Lite, Mediapipe).
Familiarity with mobile (iOS/Android) or desktop (MacOS/Windows) development environments.
Knowledge of advanced prompt engineering and LLM fine-tuning techniques.
Experience with data orchestration tools (e.g., Airflow).
Familiarity with vector databases (e.g., Pinecone, Milvus) and embedding models.
Experience working in a fast-paced, agile environment.

Perks & Benefits

Healthcare
401k
Backup childcare
Education stipends
Flexible hybrid work options

Job Description

It takes powerful technology to connect our brands and partners with an audience of hundreds of millions of people. Whether you’re looking to write mobile app code, engineer the servers behind our massive ad tech stacks, or develop algorithms to help us process trillions of data points a day, what you do here will have a huge impact on our business—and the world.

A Little About Us

It takes powerful technology to redefine how hundreds of millions of people interact with the web. Our team is building the next generation of AI-driven experiences, integrating cutting-edge large language models (LLMs) to provide smarter, faster, and more personal access to information across all major platforms (iOS, Android, MacOS, and Windows). We are the team responsible for the AI systems that power this experience, ensuring they are robust, reliable, and responsible. What you do here will have a huge impact on our business and customers.

A Lot About You

You are an engineer passionate about the rapidly evolving field of Generative AI and its practical application. You understand that building an AI product isn't just about training a model; it's about the entire lifecycle. You have a keen eye for detail and a rigorous, data-driven approach to LLM evaluation. You enjoy the challenge of prompt optimization and understand the critical importance of managing data lineage, bias, and model selection. You are a collaborative problem-solver, eager to work in a cross-functional team to maintain and enhance the AI systems that define our browser experience for millions. You thrive in a fast-paced environment and are eager to make an impact.

Responsibilities

Design, implement, and maintain robust MLOps/LLM Ops pipelines for continuous integration, delivery, and monitoring of AI models using standard and custom evaluation tools (mainly in GCP).
Maintain and enhance evaluation frameworks to benchmark new LLMs (both cloud-based and on-device) for performance, accuracy, and efficiency.
Systematically test and refine prompts to optimize for quality, relevance, safety, latency, and cost across diverse use cases.
Implement and monitor systems for detecting and mitigating accuracy and safety issues, ensuring our AI features remain safe and reliable over time.
Manage data lineage and versioning for training, validation, and evaluation datasets.
Collaborate with engineering teams (iOS, Android, Desktop) to integrate and test AI functionalities, including emerging on-device models.
Troubleshoot and optimize production AI services for latency, cost, and reliability.
Perform code reviews, maintain high code quality standards, and ensure proper documentation of systems.

Required Qualifications

BS in Computer Science or a related field, or equivalent practical experience.
2+ years of professional software development experience.
1+ years of hands-on experience in AI/ML, with specific exposure to Large Language Models (LLMs) and Generative AI.
Strong programming proficiency, particularly in Python.
You use AI coding tools as standard (this team uses Claude Code for now).
Experience with MLOps principles and tools (e.g., CI/CD pipelines, monitoring, automated testing).
Familiarity with major LLM platforms and APIs (e.g., VertexAI, OpenAI, AWS Bedrock, or open-source equivalents).
Solid understanding of software design patterns and distributed systems.
Excellent problem-solving skills and a methodical approach to evaluation.

Preferred Qualifications

MS in Computer Science or a related field.
Some experience with on-device ML frameworks (e.g., Core ML, TensorFlow Lite, Mediapipe).
Familiarity with mobile (iOS/Android) or desktop (MacOS/Windows) development environments.
Knowledge of advanced prompt engineering and LLM fine-tuning techniques.
Experience with data orchestration tools (e.g., Airflow).
Familiarity with vector databases (e.g., Pinecone, Milvus) and embedding models.
Experience working in a fast-paced, agile environment.

The material job duties and responsibilities of this role include those listed above as well as adhering to Yahoo policies; exercising sound judgment; working effectively, safely and inclusively with others; exhibiting trustworthiness and meeting expectations; and safeguarding business operations and brand integrity.

At Yahoo, we offer flexible hybrid work options that our employees love! While most roles don’t require regular office attendance, you may occasionally be asked to attend in-person events or team sessions. You’ll always get notice to make arrangements. Your recruiter will let you know if a specific job requires regular attendance at a Yahoo office or facility. If you have any questions about how this applies to the role, just ask the recruiter!

Yahoo is proud to be an equal opportunity workplace. All qualified applicants will receive consideration for employment without regard to, and will not be discriminated against based on age, race, gender, color, religion, national origin, sexual orientation, gender identity, veteran status, disability or any other protected category. Yahoo will consider for employment qualified applicants with criminal histories in a manner consistent with applicable law. Yahoo is dedicated to providing an accessible environment for all candidates during the application process and for employees during their employment. If you need accessibility assistance and/or a reasonable accommodation due to a disability, please submit a request via the Accommodation Request Form (www.yahooinc.com/careers/contact-us.html) or call +1.866.772.3182. Requests and calls received for non-disability related issues, such as following up on an application, will not receive a response.

We believe that a diverse and inclusive workplace strengthens Yahoo and deepens our relationships. When you support everyone to be their best selves, they spark discovery, innovation and creativity. Among other efforts, our 11 employee resource groups (ERGs) enhance a culture of belonging with programs, events and fellowship that help educate, support and create a workplace where all feel welcome.

The compensation for this position ranges from $88,500.00 - $184,375.00/yr and will vary depending on factors such as your location, skills and experience. The compensation package may also include incentive compensation opportunities in the form of a discretionary annual bonus or commissions. Our comprehensive benefits include healthcare, a great 401k, backup childcare, education stipends and much (much) more.

Currently work for Yahoo? Please apply on our internal career site.