Machine Learning Operations Engineer

Morningstar

3+ Years | Mumbai, Maharashtra, India (On Site) | Full Time | 1 day ago

Job Summary

As a Machine Learning Operations Engineer, you will develop and maintain cutting-edge systems for AI products, focusing on production-grade ML infrastructure like inference endpoints, orchestration, data pipelines, and scalable APIs. You will apply a software development mindset to MLOps, ensuring testing, monitoring, documentation, and reliability, while understanding machine learning principles and LLM production trade-offs. Responsibilities include building and scaling inference endpoints, developing CI/CD pipelines on AWS, and integrating various data technologies.

Must Have

Develop and maintain cutting-edge systems for AI products
Design, deploy, and scale production-grade ML infrastructure
Build and scale inference endpoints and APIs for ML models and LLMs
Develop CI/CD pipelines and automate deployment on AWS
Design and maintain data pipelines, queues, and event-driven workflows
Integrate vector databases, MCP servers, and retrieval pipelines
Contribute to Python microservices and support the orchestrator layer
Ensure monitoring, observability, and cost-aware operation of ML services
Strong programming skills in Python
3+ years of experience in MLOps, backend, or data engineering
Good knowledge of ML principles
Solid knowledge of AWS services
Experience with CI/CD pipelines, Docker/Kubernetes
Understanding of microservices architectures, queues/events, scalability
Experience with SQL databases (PostgreSQL)
Good communication skills and a product-first mindset

Good to Have

Hands-on experience deploying and operating LLMs in production
Experience with JavaScript/TypeScript
Experience with Harness
Familiarity with retrieval-augmented generation (RAG), vector DBs
Monitoring/observability tools (CloudWatch, Prometheus, Grafana)
Infrastructure-as-code (Terraform, CloudFormation)
Experience with web crawlers or large-scale data ingestion

Perks & Benefits

Hybrid work environment (four days in-office each week in most locations)
Tools and resources for global collaboration
Range of other benefits to enhance flexibility

Job Description

About the Role

As a Machine Learning Operations Engineer, you will be responsible for developing and maintaining the cutting-edge systems that bring our AI products to life.

You will design, deploy, and scale the systems that power our AI products, enabling investors worldwide to assess the Environmental, Social, and Governance (ESG) performance of companies. Your focus will be on production-grade ML infrastructure: inference endpoints, orchestration, data pipelines, and scalable APIs.

We are looking for engineers who bring a software development mindset into MLOps (testing, monitoring, documentation, and reliability) while also understanding machine learning principles and the trade-offs of running LLMs in production.

Responsibilities

Build and scale inference endpoints and APIs for both classic ML models and LLMs.
Develop CI/CD pipelines and automate deployment on AWS (Bedrock, Lambda, EKS, S3, etc.).
Design and maintain data pipelines, queues, and event-driven workflows.
Integrate vector databases, MCP servers, and retrieval pipelines into production systems.
Contribute to microservices in Python and support our orchestrator layer.
Ensure monitoring, observability, and cost-aware operation of deployed ML services.
Collaborate with AI researchers and software engineers to productize prototypes.

Qualifications

Strong programming skills in Python (APIs, pipelines, services).
3+ years of experience in MLOps, backend engineering, data engineering, or related roles.
Good knowledge of ML principles (e.g. precision, recall, inference time, latency/throughput trade-offs).
Solid knowledge of AWS services (Bedrock, Lambda, EKS, S3, etc.).
Experience with CI/CD pipelines, containerization (Docker/Kubernetes).
Understanding of microservices architectures, queues/events, and scalability.
Experience with SQL databases (PostgreSQL).
Good communication skills and a product-first mindset.

Nice to Have

Hands-on experience deploying and operating LLMs in production, with awareness of limitations, evaluation, and cost implications.
Experience with JavaScript/TypeScript.
Experience with Harness.
Familiarity with retrieval-augmented generation (RAG), vector DBs.
Monitoring/observability tools (CloudWatch, Prometheus, Grafana).
Infrastructure-as-code (Terraform, CloudFormation).
Experience with web crawlers or large-scale data ingestion.

Morningstar is an equal opportunity employer

Morningstar's hybrid work environment gives you the opportunity to collaborate in-person each week as we've found that we're at our best when we're purposely together on a regular basis. In most of our locations, our hybrid work model is four days in-office each week. A range of other benefits are also available to enhance flexibility as needs change. No matter where you are, you'll have tools and resources to engage meaningfully with your global colleagues.

Legal Entity: Morningstar India Private Ltd. (Delhi)
