Senior Software Engineer, AI/ML Data Systems

Research Development

Job Description

Join us to design the core data systems powering both traditional machine learning and cutting-edge generative AI/LLM workflows. As a Senior Software Engineer, you’ll specialize in either Data & Feature Store Infrastructure or Labeling & Human Feedback Systems. You’ll work closely with ML engineers, MLOps, and product teams to deliver high-impact data and labeling solutions at scale, turning AI research into production-ready features that create real customer value.
Perks:
  • Competitive compensation package
  • Annual cash bonuses
  • Stock grants
  • Comprehensive benefits package

Position Overview

Join us to design the core data systems powering both traditional machine learning and cutting-edge generative AI/LLM workflows. As a Senior/Principal Software Engineer, you’ll specialize in one of two tracks:

  • Data & Feature Store Infrastructure: Build scalable backend systems for data ingestion, batch/streaming ETL pipelines, feature stores, vector-enabled APIs, and data compliance
  • Labeling & Human Feedback Systems: Design multimodal annotation platforms (text, image, audio, video, 3D), develop RLHF workflows (instruction tuning, output ranking), and drive LLM-assisted labeling innovations

You’ll work closely with ML engineers, MLOps, and product teams to deliver high-impact data and labeling solutions at scale. Reporting to the Head of AI & ML Platform, you’ll turn AI research into production-ready features that create real customer value.

Responsibilities

Choose one track to focus on:

Data & Feature Store Infrastructure

  • Design and implement scalable feature engineering systems for both batch and streaming computation
  • Build and maintain low-latency online feature serving systems with consistency between training and inference
  • Develop and maintain monitoring systems for feature freshness, data drift, and data quality
  • Integrate feature management solutions with vector databases to support embeddings and retrieval-augmented generation (RAG) workflows
  • Ensure compliance, lineage, and best practices for infrastructure as code

Labeling & Human Feedback Systems

  • Build and scale annotation platforms for diverse data types: text, image, video, audio, and 3D
  • Develop workflows for LLM alignment, including instruction tuning and RLHF (Reinforcement Learning from Human Feedback) output ranking
  • Embed LLM-assisted labeling features such as auto-labeling, policy checking, and active learning
  • Drive annotation quality through processes such as inter-annotator agreement, gold standard samples, and anomaly detection
  • Manage and scale internal/external labeling teams while maintaining secure data integration

Minimum Qualifications

  • 5+ years of experience in data engineering, ML platform, or backend development roles
  • Proficiency in at least one modern programming language (Python preferred)
  • Experience developing and operating distributed backend APIs and SDKs
  • Experience working with cloud platforms (AWS, GCP, or Azure), containers (Docker/Kubernetes), and infrastructure-as-code tools (e.g., Terraform)

In addition, specialization experience in one of the following tracks:

Feature Store Track (experience with at least TWO of the following):

  • Hands-on experience with feature store frameworks (e.g., SageMaker Feature Store, Feast, Tecton, Hopsworks), or operating vector database systems for serving LLM use cases
  • Experience with batch and/or streaming data pipelines (e.g., Kafka, Flink, Spark, Ray) and orchestration tools (e.g., Airflow, Argo Workflows)
  • Demonstrated experience in at least one of the following data areas: data catalog, data validation, versioning, lineage, or security/compliance

Labeling Track (experience with at least ONE of the following):

  • Proven working experience with labeling platforms (e.g., GroundTruth, Label Studio)
  • Experience with RLHF/instruction tuning or annotation workflow development

Preferred Qualifications

  • Experience with LLM pipelines, including embeddings, retrieval-augmented generation (RAG), or prompt engineering
  • Familiarity with labeling copilot tools, active learning, or managing hybrid annotation teams
  • Knowledge of knowledge graphs or semantic data modeling

