Senior Machine Learning Engineer

7-10 Years
Research Development

Job Description

Build and operate the production backbone for ML services, taking models from Applied Sciences to deliver reliable, low-latency services across Tekion’s automotive platform. This role involves owning pipelines, microservices, CI/CD, observability, and runtime reliability, collaborating with Applied Sciences and Product to drive measurable impact for dealers and consumers. Focus on accelerating LLM-powered features, operationalizing secure and scalable LLM/agentic services, and standardizing model deployment and monitoring.
Must Have:
  • Turn Applied Sciences prototype models into fast, reliable services.
  • Integrate with LLM Gateway/MCP and manage prompt/config versioning.
  • Build and orchestrate CI/CD pipelines.
  • Review, refactor, optimize, containerize, deploy, and monitor data science models.
  • Design enterprise systems in collaboration with data scientists, engineers, and product managers.
  • Monitor, detect, and mitigate risks unique to LLMs and agentic systems.
  • Implement prompt management including versioning, A/B testing, and guardrails.
  • Design batch/stream pipelines and online features.
  • Build inference microservices with schema versioning and stringent latency targets.
  • Manage the full ML model and feature lifecycle.
  • Implement robust observability and real-time reliability for ML systems.
  • Develop templates, SDKs, CLIs, sandbox datasets, and documentation.
  • 7-10 years in ML engineering or production backend/platform engineering.
  • Proficiency in Python and one of Java/Go/Scala.
  • Experience with LLMs, vector stores, and graph/knowledge stores.
  • Hands-on with ML orchestration frameworks (e.g., LangChain).
  • Strong experience with Docker, Kubernetes, and cloud platforms (AWS preferred).
  • Expertise in data processing technologies (Spark/Flink, Kafka).
  • Knowledge of security and compliance in ML systems.

Position Summary

Build and operate the production backbone that takes models from Applied Sciences (AS) and delivers reliable, low-latency ML services across Tekion’s DMS, CRM, Digital Retail, Service, Payments, and enterprise products. You’ll own pipelines, microservices, CI/CD, observability, and runtime reliability, working hand-in-hand with Applied Sciences and Product to turn ideas into measurable dealer and consumer impact.

Why this Role Matters

  • Accelerate the rollout of LLM-powered and agent-driven features across Tekion products.
  • Enable agentic workflows that automate, reason, and interact on behalf of users and internal stakeholders.
  • Operationalize secure, compliant, and explainable LLM and agentic services at scale.
  • Convert Applied Sciences models into scalable, compliant, cost‑efficient production services.
  • Standardize how models are trained, validated, deployed, and monitored across Tekion products.
  • Power real-time, context-aware experiences by integrating batch/stream features, graph context, and online inference.

What You’ll Do

  • Turn Applied Sciences prototype models (tabular, NLP/LLM, recommendation, forecasting) into fast, reliable services with well-defined API contracts.
  • Integrate with the LLM Gateway/MCP and manage prompt/config versioning.
  • Build and orchestrate CI/CD pipelines.
  • Review data science models; refactor and optimize code; containerize; deploy; version; and monitor for quality.
  • Collaborate with data scientists, data engineers, product managers, and architects to design enterprise systems.
  • Monitor, detect, and mitigate risks unique to LLMs and agentic systems.
  • Implement prompt management: versioning, A/B testing, guardrails, and dynamic orchestration based on feedback and metrics (first sketch after this list).
  • Design batch/stream pipelines (Airflow/Kubeflow, Spark/Flink, Kafka) and online features linked to our domain graph.
  • Build inference microservices (REST/gRPC) with schema versioning, structured outputs, and stringent p95 latency targets (second sketch after this list).
  • Manage the model/feature lifecycle: feature store strategy, model/agent registry, versioning, and lineage.
  • Instrument deep observability: traces/logs/metrics, data/feature drift, model performance, safety signals, and cost tracking.
  • Ensure real-time reliability: autoscaling, caching, circuit breakers, retries/fallbacks, and graceful degradation (third sketch after this list).
  • Develop templates/SDKs/CLIs, sandbox datasets, and documentation that make shipping ML the default path.
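
To make the prompt-management bullet concrete, here is a minimal sketch, in plain Python, of versioned prompt templates with weighted A/B routing. The registry contents, template names, and traffic weights are invented for illustration; a production setup would hold this state in the LLM Gateway's own configuration store rather than in code.

  import random
  from dataclasses import dataclass

  @dataclass(frozen=True)
  class PromptVersion:
      name: str        # logical prompt name
      version: str     # template version under test
      template: str    # prompt text with named placeholders

  # Hypothetical registry: two live versions of one prompt under an A/B test.
  REGISTRY = {
      "service-summary": [
          (PromptVersion("service-summary", "1.0.0",
                         "Summarize the repair order: {order}"), 0.8),
          (PromptVersion("service-summary", "1.1.0",
                         "In two sentences, summarize the repair order: {order}"), 0.2),
      ],
  }

  def choose_prompt(name: str) -> PromptVersion:
      """Pick a prompt version according to its A/B traffic weight."""
      versions, weights = zip(*REGISTRY[name])
      return random.choices(versions, weights=weights, k=1)[0]

  prompt = choose_prompt("service-summary")
  print(prompt.version, prompt.template.format(order="Replace front brake pads"))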
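
For the inference-microservice bullet, a second sketch: a versioned REST endpoint assuming FastAPI and Pydantic (neither is mandated by this posting). Request and response schemas are pinned to a /v1 route so a breaking schema change ships as /v2 instead of surprising callers; the model lookup and field names are placeholders.

  from typing import Dict

  from fastapi import FastAPI
  from pydantic import BaseModel

  app = FastAPI()

  class ScoreRequestV1(BaseModel):
      dealer_id: str               # illustrative field names, not a real contract
      features: Dict[str, float]

  class ScoreResponseV1(BaseModel):
      model_version: str
      score: float

  @app.post("/v1/score", response_model=ScoreResponseV1)
  def score(req: ScoreRequestV1) -> ScoreResponseV1:
      # A real handler would call a model pulled from the registry; a constant
      # response keeps the example self-contained (serve with uvicorn).
      return ScoreResponseV1(model_version="1.4.2", score=0.0)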
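
And for the reliability bullet, a third sketch: retries with exponential backoff that degrade to a fallback answer instead of failing outright. It is a hand-rolled decorator for illustration only; in practice much of this behavior lives in client libraries, the service mesh, or gateway policy.

  import time
  from functools import wraps

  def with_retries(max_attempts: int = 3, base_delay: float = 0.1, fallback=None):
      """Retry a flaky call with exponential backoff, then degrade gracefully."""
      def decorator(fn):
          @wraps(fn)
          def wrapper(*args, **kwargs):
              for attempt in range(max_attempts):
                  try:
                      return fn(*args, **kwargs)
                  except Exception:
                      if attempt == max_attempts - 1:
                          # Out of attempts: return a degraded answer rather than erroring.
                          return fallback(*args, **kwargs) if fallback else None
                      time.sleep(base_delay * (2 ** attempt))
          return wrapper
      return decorator

  @with_retries(fallback=lambda *a, **k: {"score": None, "degraded": True})
  def call_scoring_service(payload):
      raise TimeoutError("upstream timed out")  # stand-in for a real remote call

  print(call_scoring_service({"dealer_id": "d-1"}))  # prints the fallback after retries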

Desired Skills and Experience

  • 7-10 years in ML engineering/MLOps or backend/platform engineering with production ML.
  • Experience with LLMs, retrieval systems, vector stores, and graph/knowledge stores.
  • Strong software engineering fundamentals: Python plus one of Java/Go/Scala; API design; concurrency; testing.
  • Hands-on with orchestration frameworks and libraries (LangChain, LlamaIndex, OpenAI Function Calling, AgentKit, etc.).
  • Knowledge of agent architectures (reactive, planning, retrieval-augmented agents), and safe execution patterns.
  • Pipelines and data: Airflow/Kubeflow or similar; Spark/Flink; Kafka/Kinesis; strong data quality practices.
  • Microservices and runtime: Docker/Kubernetes, service meshes, REST/gRPC; performance and reliability engineering.
  • Model ops: experiment tracking, registries (e.g., MLflow), feature stores, A/B and shadow testing, drift detection.
  • Observability: OpenTelemetry/Prometheus/Grafana; debugging latency, tail behavior, and memory/CPU hotspots (a short metrics sketch follows this list).
  • Cloud: AWS preferred (IAM, ECS/EKS, S3, RDS/DynamoDB, Step Functions/Lambda), with cost optimization experience.
  • Security/compliance: secrets management, RBAC/ABAC, PII handling, auditability.
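
To ground the observability line, a minimal sketch using prometheus_client (one of the tools named in that bullet) to expose a latency histogram and an error counter from a Python process; the metric names, port, and simulated workload are purely illustrative.

  import random
  import time

  from prometheus_client import Counter, Histogram, start_http_server

  # Illustrative metric names; real services would follow their own conventions.
  REQUEST_LATENCY = Histogram("inference_request_seconds", "Inference request latency")
  REQUEST_ERRORS = Counter("inference_request_errors_total", "Failed inference requests")

  @REQUEST_LATENCY.time()
  def handle_request():
      time.sleep(random.uniform(0.01, 0.05))  # stand-in for real model work
      if random.random() < 0.05:
          REQUEST_ERRORS.inc()
          raise RuntimeError("simulated upstream failure")

  if __name__ == "__main__":
      start_http_server(9000)  # metrics scrapeable at http://localhost:9000/metrics
      while True:
          try:
              handle_request()
          except RuntimeError:
              pass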

Preferred Mindset

  • Product-oriented: You measure success by dealer and consumer outcomes, not just technical metrics.
  • Reliability- and safety-first: You move fast with guardrails, rollbacks, and clear SLOs.
  • Systems thinker: You design for multi-tenant scale, portability, and cost efficiency.
  • Collaborative: You translate between Applied Sciences, Product, and the Data & AI Platform; you document and teach.
  • Pragmatic: You automate the 80% and leave room for rapid experimentation.

Perks and Benefits

  • Competitive compensation
  • Generous stock options
  • Medical Insurance coverage
  • Work with some of the brightest minds from Silicon Valley’s most dominant and successful companies

Tekion is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, victim of violence or having a family member who is a victim of violence, the intersectionality of two or more protected categories, or other applicable legally protected characteristics.

For more information on our privacy practices, please refer to our Applicant Privacy Notice here.
