Research Engineer / Scientist - AI for Databases

ByteDance

Job Summary

The Infrastructure System Lab at ByteDance is seeking a Research Engineer/Scientist for its AI4DB team. This role involves designing and implementing intelligent systems to enhance database performance, scalability, and usability, focusing on query optimization, indexing, workload forecasting, and self-managing databases. The team works on AI-native data infrastructure, including VectorDBs, multi-modal databases, and intelligent infrastructure optimization, with opportunities for publication and open-source contributions.

Perks & Benefits

  • Competitive compensation
  • Strong research support
  • Innovation-driven environment
  • Day-one access to medical, dental, and vision insurance
  • 401(k) savings plan with company match
  • Paid parental leave
  • Short-term and long-term disability coverage
  • Life insurance
  • Wellbeing benefits
  • 10 paid holidays per year
  • 10 paid sick days per year
  • 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure)

Job Description

Responsibilities

About the Team

The Infrastructure System Lab is a hybrid research and engineering team dedicated to building the next generation of AI-native data infrastructure. Operating at the crossroads of databases, large-scale systems, and AI, the team innovates across multiple domains, including advanced VectorDBs and multi-modal databases for large-scale retrieval and reasoning, intelligent infrastructure optimization using machine learning, LLM-based developer tools like NL2SQL and NL2Chart, and high-performance cache systems for distributed storage and LLM inference. The lab is deeply collaborative, with researchers and engineers working side by side to turn groundbreaking ideas into production-ready systems. Their work is deployed at scale, powering products used by millions, and frequently shared through publications and open-source contributions.

About the Role

We are looking for a passionate and skilled professional to join our AI4DB team, where artificial intelligence meets cutting-edge database technology. In this role, you will design and implement intelligent systems that improve the performance, scalability, and usability of modern databases. Your work will span query optimization, indexing strategies, workload forecasting, and the development of self-managing databases. This role offers the chance to solve complex, high-impact problems at the intersection of AI, systems, and software engineering, while collaborating with a top-tier team. You'll have opportunities to publish, contribute to open-source projects, and attend leading conferences, and you'll benefit from competitive compensation, strong research support, and an innovation-driven environment.

Responsibilities

  • Conduct research and development in applying AI/ML techniques to database management systems.
  • Develop intelligent algorithms for tasks such as query planning, indexing, storage management, and workload prediction/scheduling.
  • Collaborate with data infrastructure and engineering teams to integrate AI models into production systems.
  • Analyze large-scale datasets from database workloads to uncover optimization opportunities.
  • Publish findings in top-tier conferences and journals such as VLDB, SIGMOD, ICDE, and NeurIPS.
  • Contribute to open-source projects or internal tools supporting AI-enhanced databases.

Qualifications

Minimum Qualifications

  • PhD in Computer Science, Data Science, or a related field with a focus on databases, systems, or machine learning.
  • Strong publication record in top-tier venues (e.g., SIGMOD, VLDB, ICDE, NeurIPS) related to the AI4DB area.
  • Strong background in database internals (e.g., PostgreSQL, MySQL, or modern cloud-native databases or big data platforms).
  • Hands-on experience with machine learning frameworks (e.g., XGBoost, LightGBM, TensorFlow, PyTorch, scikit-learn).

Preferred Qualifications

  • Proficiency in Python, C++, or Java.
  • Experience with cloud database platforms (AWS, GCP, Azure).
  • Strong analytical, problem-solving, and communication skills.
  • Familiarity with LLMs, reinforcement learning, neural architecture search, or automated database tuning.
