ML Ops Engineer
Stord
Job Summary
Stord is seeking an ML Operations Bridge Engineer to join its newly formed AI team, focused on building cutting-edge ML features. The role bridges data science and production engineering: deploying models to production APIs, building real-time feature-engineering pipelines, and developing CI/CD for model deployment. The engineer will work on critical features like delivery time estimation and demand forecasting, with the freedom to shape MLOps practices and tooling.
Must Have
- Deploy trained ML models to production platforms like Modal.com or Vertex AI
- Build APIs that serve model predictions with low latency
- Implement A/B testing frameworks for model comparison
- Create model versioning and rollback strategies
- Monitor model performance and detect drift
- Build real-time feature engineering pipelines using Kafka Streams or similar tooling
- Create data validation and quality monitoring systems
- Design feature stores for both training and inference
- Implement efficient data transformations from Postgres/AlloyDB sources
- Develop CI/CD pipelines for model deployment
- Build monitoring dashboards for model and pipeline health
- Optimize inference costs across cloud platforms
- Create developer tools for ML feature integration
- Document and evangelize MLOps best practices
- Partner with Data Scientists to understand model requirements
- Collaborate with platform engineers to integrate ML features
- Work with product teams to define success metrics
Good to Have
- Experience with Modal.com, Vertex AI, or similar ML platforms
- Kafka or other streaming data platform experience
- Familiarity with Elixir or functional programming
- Knowledge of logistics, e-commerce, or supply chain domains
- Experience with Cloudflare Workers or edge computing
- Contributions to open source ML/data tools
- Experience with feature stores (Feast, Tecton)
- Container orchestration (Kubernetes)
Job Description
Stord is The Consumer Experience Company, powering seamless checkout through delivery for today's leading brands. Stord is rapidly growing and on track to double its revenue in the next 18 months. To meet and exceed this target, Stord is strategically scaling teams across the entire company and seeking energetic experts to help us achieve our mission.
By combining comprehensive commerce-enablement technology with high-volume fulfillment services, Stord provides brands a platform to compete with retail giants. Stord manages over $10 billion of commerce annually through its fulfillment, warehousing, transportation, and operator-built software suite including OMS, Pre- and Post-Purchase, and WMS platforms. Stord is leveling the playing field for all brands to deliver the best consumer experience at scale.
With Stord, brands can increase cart conversion, improve unit economics, and drive sustained customer loyalty. Stord’s end-to-end commerce solutions combine best-in-class omnichannel fulfillment and shipping with leading technology to ensure fast shipping, reliable delivery promises, easy access to more channels, and improved margins on every order.
Hundreds of leading DTC and B2B companies like AG1, True Classic, Native, Seed Health, quip, goodr, Sundays for Dogs, and more trust Stord to deliver industry-leading consumer experiences on every order. Stord is headquartered in Atlanta with facilities across the United States, Canada, and Europe. Stord is backed by top-tier investors including Kleiner Perkins, Franklin Templeton, Founders Fund, Strike Capital, Baillie Gifford, and Salesforce Ventures.
Stord is revolutionizing the logistics industry with our cloud-based supply chain platform. Our newly formed AI team is building cutting-edge features that leverage both traditional ML models (deployed on Modal.com) and LLM capabilities (via Cloudflare Workers AI, Vertex AI, and direct provider integrations). We need someone who can bridge the gap between data science and production engineering to help us ship ML features rapidly and reliably.
We are seeking a skilled ML Operations Bridge Engineer who thrives at the intersection of data science and software engineering. You'll work directly with our Senior Data Scientist to take models from Jupyter notebooks to production APIs serving millions of predictions daily. This is a hands-on role where you'll build data pipelines, deploy models, create monitoring systems, and ensure our ML features deliver real business value.
In this role, you'll be instrumental in building our ML infrastructure from the ground up. You'll work on critical features like delivery time estimation, demand forecasting, and AI-powered insights, with the freedom to shape our MLOps practices and tooling choices. This is a unique opportunity to have massive impact on a small, ambitious team.
What You'll Do:
ML Operations & Deployment:
- Take trained models from our Data Scientist and deploy them to Modal.com or Vertex AI
- Build TypeScript (or Python or Elixir – the best tool for the job) APIs that serve model predictions with <100ms latency
- Implement A/B testing frameworks for model comparison
- Create model versioning and rollback strategies
- Monitor model performance and catch drift before customers notice
Data Pipeline Development:
- Build real-time feature engineering pipelines using Kafka Streams or similar tooling
- Create data validation and quality monitoring systems
- Design feature stores that serve both training and inference
- Implement efficient data transformations from our Postgres/AlloyDB sources
- Ensure data consistency across our ML and production systems
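As one illustration of the drift-monitoring side of this pipeline work, here is a self-contained sketch of the Population Stability Index (PSI), a common drift metric comparing a training sample against live traffic. The function name and thresholds are illustrative, not a reference to any specific Stord system.

```python
import math

def psi(expected, actual, bins: int = 10) -> float:
    """Population Stability Index between a training sample ('expected')
    and live serving traffic ('actual') for one numeric feature.
    Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift,
    > 0.25 significant drift worth alerting on."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def proportions(xs):
        counts = [0] * bins
        for x in xs:
            i = min(max(int((x - lo) / width), 0), bins - 1)  # clamp outliers
            counts[i] += 1
        # floor empty buckets at a tiny value so log() stays finite
        return [max(c / len(xs), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

In practice a job like this would run per-feature on a schedule, with the training-time distribution snapshotted alongside the model artifact.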
Infrastructure & Integration:
- Develop CI/CD pipelines for model deployment
- Build monitoring dashboards for model and pipeline health
- Optimize inference costs across Modal, Cloudflare, and GCP
- Create developer tools that make ML features easy to integrate
- Document and evangelize MLOps best practices
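A CI/CD gate for model deploys often boils down to a canary check like the sketch below: replay known-good ("golden") requests against the candidate model and fail the pipeline if predictions or latency regress. `canary_check` and its parameters are hypothetical, shown only to illustrate the shape of such a gate.

```python
import time

def canary_check(predict_fn, golden_cases, max_latency_ms: float = 100.0):
    """Hypothetical deploy gate: the candidate model must reproduce
    known-good predictions within tolerance and stay under the
    latency budget (the posting's <100ms target is the default)."""
    for features, expected in golden_cases:
        start = time.perf_counter()
        got = predict_fn(features)
        latency_ms = (time.perf_counter() - start) * 1000
        if latency_ms > max_latency_ms:
            return False, f"latency {latency_ms:.1f}ms over budget"
        if abs(got - expected) > 1e-3:
            return False, f"prediction {got} != expected {expected}"
    return True, "ok"
```

A CI step would run this against a staging endpoint and block promotion on a `False` result, making rollbacks a non-event rather than an incident.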
Cross-functional Collaboration:
- Partner with the Data Scientist to understand model requirements
- Work with platform engineers to integrate ML features into our core Elixir services
- Collaborate with product teams to define success metrics
- Help other engineers understand and use ML capabilities
What You'll Need:
- Strong Python (3+ years) - You've shipped ML models to production, not just notebooks
- Strong TypeScript/JavaScript (2+ years) - You can build robust APIs and understand async patterns
- MLOps Experience - You've deployed models, built pipelines, and monitored performance
- Data Engineering - Experience with streaming data, ETL/ELT, and data quality
- Cloud Platforms - Hands-on experience with GCP, AWS, or Azure (we’re on GCP)
- Version Control - Expert with Git/GitHub and collaborative workflows
- SQL Proficiency - Can write complex queries and optimize performance
- Production Mindset - You care more about customer impact than perfect code
- Pragmatic Approach - You know when to use simple solutions vs complex ones
- Strong Communication - Can explain technical decisions to various audiences
- Self-Directed - You identify what needs doing without detailed specs
- Learning Agility - Comfortable picking up new tools and technologies quickly
Preferred Qualifications:
- Experience with Modal.com, Vertex AI, or similar ML platforms
- Kafka or other streaming data platform experience
- Familiarity with Elixir or functional programming
- Knowledge of logistics, e-commerce, or supply chain domains
- Experience with Cloudflare Workers or edge computing
- Contributions to open source ML/data tools
- Experience with feature stores (Feast, Tecton)
- Container orchestration (Kubernetes)