Senior Machine Learning Engineer, Computer Vision - Robotics

Scale AI

Job Summary

Scale’s Robotics business unit is dedicated to solving the data bottleneck in Physical AI. This position will be a key contributor in conducting applied research in Robotics and developing ML pipelines for training and fine-tuning on data collected by Scale. We are seeking an experienced Senior Machine Learning Engineer, Computer Vision to drive cutting-edge research and development in real-time and offline 2D and 3D algorithms. The successful candidate will be a hands-on technical leader responsible for translating complex computer vision algorithms from research papers into robust, production-ready systems that power our next-generation products. This role requires a deep theoretical background combined with substantial practical experience working with spatial and temporal data.

Must Have

  • Lead research, design, and implementation of novel 2D/3D computer vision and deep learning algorithms.
  • Drive innovation in 3D Reconstruction and SLAM.
  • Develop robust models for Hand/Body Tracking.
  • Design high-performance deep learning models for Object Detection and Tracking (MOT/SOT).
  • Create algorithms for Video Processing.
  • Optimize computationally intensive models for edge devices and/or cloud infrastructure.
  • Serve as subject matter expert in Computer Vision, providing technical direction and mentorship.
  • Maintain state-of-the-art knowledge, evaluate academic publications, and drive patents/research.
  • Collaborate with Software Engineering, Product, and Hardware teams.
  • Ph.D. in Computer Science or related field OR Master’s with 4+ years professional experience.
  • 5+ years hands-on experience in 2D/3D computer vision and deep learning algorithm development.
  • Expert proficiency in PyTorch, TensorFlow, or JAX.
  • Mastery of Python for machine learning.
  • Strong proficiency in C++ for performance-critical algorithm implementation.
  • In-depth knowledge of classical and modern computer vision fundamentals.
  • Experience building real-time and batch ML systems.
  • Experience rapidly prototyping and iterating on ML systems.

Perks & Benefits

  • Comprehensive health, dental and vision coverage
  • Retirement benefits
  • Learning and development stipend
  • Generous PTO
  • Commuter stipend (may be eligible)

Job Description

Scale’s Robotics business unit is dedicated to solving the data bottleneck in Physical AI. This position will be a key contributor in conducting applied research in Robotics and developing ML pipelines for training and fine-tuning on data collected by Scale. In this role, you will have the opportunity to advance robotics research, shape Scale’s robotics offerings, and expand the frontier of robotics data and model evaluation.

We are seeking an exceptionally motivated and experienced Senior Machine Learning Engineer, Computer Vision to drive cutting-edge research and development in real-time and offline 2D and 3D algorithms. The successful candidate will be a hands-on technical leader responsible for translating complex computer vision algorithms from research papers into robust, production-ready systems that power our next-generation products.

This role requires a deep theoretical background combined with substantial practical experience working with spatial and temporal data.

You will:

  • Pioneer Core CV Algorithms: Lead the research, design, and implementation of novel computer vision and deep learning algorithms, with a specialized focus on 2D and 3D data (e.g., point clouds).
  • Focus Area Expertise: Drive innovation in key perception areas, including:
      • 3D Reconstruction and SLAM: Advanced techniques for real-time 3D mapping, pose estimation, and environmental modeling from multi-modal sensor inputs (e.g., RGB-D, LiDAR).
      • Hand/Body Tracking: Developing robust and precise models for hand pose estimation, gesture recognition, and full-body tracking under various lighting and occlusion conditions.
      • Object Detection and Tracking (MOT/SOT): Designing high-performance deep learning models for accurate detection and persistent tracking of objects and people in video streams.
      • Video Processing: Creating algorithms for temporal feature extraction, video-based action recognition, and motion analysis.
  • Model Optimization: Optimize computationally intensive models for deployment on edge devices (low power, low latency) and/or large-scale cloud infrastructure.
  • Technical Leadership: Serve as the subject matter expert in Computer Vision, providing technical direction and mentorship to junior engineers and cross-functional teams.
  • Publication & IP: Maintain state-of-the-art knowledge, evaluate recent academic publications (e.g., CVPR, ICCV, ECCV), and drive the filing of patents and publication of novel research.
  • Cross-Functional Partnering: Collaborate closely with Software Engineering, Product, and Hardware teams to define requirements, integrate vision systems, and ensure solutions meet performance targets.

You have:

  • Ph.D. in Computer Science, Computer Engineering, or a related quantitative field (Mathematics, Electrical Engineering, etc.) OR a Master’s degree with 4+ years of equivalent professional experience in an applied research setting.
  • 5+ years of hands-on experience in algorithm development for 2D/3D computer vision and deep learning.
  • Deep Learning Frameworks: Expert proficiency in at least one major deep learning framework (PyTorch, TensorFlow, or JAX).
  • Programming: Mastery of Python for machine learning and strong proficiency in C++ for performance-critical algorithm implementation.
  • 2D/3D Fundamentals: In-depth knowledge of classical and modern computer vision fundamentals, including multi-view geometry, projective geometry, camera calibration, and 3D graphics/rendering principles.
  • ML Systems: Experience building real-time and batch ML systems that analyze structured and unstructured signals.
  • Prototyping: Hands-on experience rapidly prototyping and iterating on ML systems with changing requirements.
