Research Scientist - AI Infrastructure PhD 2026

All levels • Research Development • $136,800 PA - $259,200 PA

Job Summary

Job Description

As a Research Scientist on the AI Infra Team at ByteDance, you will design and optimize robust, scalable infrastructure for cutting-edge AI/ML initiatives. You'll collaborate with researchers and engineers to create high-performance environments for training, inference, and data processing. Your expertise in areas like infrastructure design, performance optimization, distributed systems, and data pipelines will be crucial in enabling the next generation of AI-driven products and services, pushing the state of the art in ML infrastructure.
Must have:
  • Lead end-to-end design of scalable, reliable AI infrastructure.
  • Define and implement service-oriented, containerized architectures.
  • Profile and optimize every layer of the ML stack.
  • Develop low-overhead telemetry and benchmarking frameworks.
  • Build and operate large-scale deployment and orchestration systems.
  • Champion fault-tolerance, high availability, and cost-efficiency.
  • Architect and implement robust ETL and data ingestion pipelines.
  • Integrate experiment management and workflow orchestration tools.
  • Partner with ML researchers to translate prototype requirements into production-grade systems.
  • Mentor and coach engineers on best practices.
  • PhD in Computer Science, Engineering, or related field (2026 graduation).
  • Understanding of infrastructure or systems engineering, with a focus on ML/AI infrastructure.
  • Strong programming skills in Python, C++, Go, or Rust.
  • Excellent communication skills.
  • Strong problem-solving aptitude.
Perks:
  • Medical, dental, and vision insurance
  • 401(k) savings plan with company match
  • Paid parental leave
  • Short-term and long-term disability coverage
  • Life insurance
  • Wellbeing benefits
  • 10 paid holidays per year
  • 10 paid sick days per year
  • 17 days of Paid Personal Time

Job Details

Responsibilities

We are looking for talented individuals to join our team in 2026. As a graduate, you will get opportunities to pursue bold ideas, tackle complex challenges, and unlock limitless growth. Launch your career where inspiration is infinite at ByteDance.

Successful candidates must be able to commit to an onboarding date by end of year 2026. Please state your availability and graduation date clearly in your resume.

On the AI Infra Team, you'll be immersed in the robust and scalable infrastructure that powers our cutting-edge artificial intelligence (AI) and machine learning (ML) initiatives. You will work closely with our AI/ML researchers, data scientists, and software engineers to create an efficient, high-performance environment for training, inference, and data processing. Your expertise will be critical in enabling the next generation of AI-driven products and services.

The ideal candidate should be an expert in at least one of the following fields to define and design the next-gen AI Infrastructure:

  • Infrastructure Design & Architecture
      • Lead end-to-end design of scalable, reliable AI infrastructure (AI accelerators, compute clusters, storage, networking) for training and serving large ML workloads.
      • Define and implement service-oriented, containerized architectures (Kubernetes, VM frameworks, unikernels) optimized for ML performance and security.
  • Performance Optimization
      • Profile and optimize every layer of the ML stack: ML compiler, GPU/TPU scheduling, NCCL/RDMA networking, data preprocessing, and training/inference frameworks.
      • Develop low-overhead telemetry and benchmarking frameworks to identify and eliminate bottlenecks in distributed training and serving.
  • Distributed Systems & Scalability
      • Build and operate large-scale deployment and orchestration systems that auto-scale across multiple data centers (on-premises and cloud).
      • Champion fault-tolerance, high availability, and cost-efficiency through smart resource management and workload placement.
  • Data Pipeline & Workflow Engineering
      • Architect and implement robust ETL and data ingestion pipelines (Spark/Beam/Dask/Flume) tailored for petabyte-scale ML datasets.
      • Integrate experiment management and workflow orchestration tools (Airflow, Kubeflow, Metaflow) to streamline research-to-production.
  • Collaboration & Mentorship
      • Partner with ML researchers to translate prototype requirements into production-grade systems.
      • Mentor and coach engineers on best practices in performance tuning, systems design, and reliability engineering.

Qualifications

Minimum Qualifications

  • PhD in Computer Science, Engineering, or a related technical field, with a graduation date in 2026.
  • Understanding of infrastructure or systems engineering, with a focus on ML/AI infrastructure.
  • Strong programming skills in Python, C++, Go, or Rust for systems development and automation.
  • Excellent communication skills, with the ability to bridge research and production teams.
  • Strong problem-solving aptitude and a drive to push the state of the art in ML infrastructure.

About The Company

Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.