Software Engineer, LLM Inference Scheduling Intern

3 Months ago • All levels • Software Development & Engineering

Job Summary

Job Description

This internship role involves scheduling and orchestrating heterogeneous resources for LLMs, including computing power pooling, elastic resource mixed deployment, and quota management. The intern will participate in multi-role, multi-stage scheduling for LLM inference services, focusing on KVCache-centric scheduling for dynamic scaling. They will optimize the allocation of computing, RDMA network, and cache/storage resources across distributed clusters. Responsibilities also include ensuring LLM service stability, problem diagnosis, and recovery across diverse environments, as well as task/service scheduling in multi-datacenter, multi-region, and multi-cloud scenarios to keep load reasonably distributed.
Must have:
  • Proficiency in C++, Go, Python, or Shell in Linux.
  • Understanding of Kubernetes and container technologies (Docker, etc.).
  • Excellent logical analysis and business logic breakdown skills.
  • Strong work ethic, learning ability, and communication skills.
  • Ability to create and update workflow and technical documents.
Good to have:
  • Experience with machine learning frameworks or inference engines.
  • Practical experience in resource scheduling of large models.
  • Understanding of GPU systems and architectures.
  • Experience publishing papers at top academic conferences.

Job Details

About the Team

The ByteDance Doubao Large Model Team was established in 2023 and is dedicated to developing the most advanced AI large model technology in the industry, becoming a world-class research team, and contributing to the development of technology and society. The team has a long-term vision and determination in the field of AI, with research directions covering NLP, CV, speech, and more. It has laboratories and research positions in China, Singapore, the US, and other locations. Relying on the platform's ample data, computing, and other resources, the team continuously invests in related fields and has launched self-developed general large models that provide multimodal machine learning capabilities. Downstream, it supports 50+ businesses such as Doubao, Coze, and Dreamina, and is open to enterprise customers through Volcengine. Currently, the Doubao app has become the largest AIGC application in the Chinese market.

Responsibilities

1. Engage in the scheduling and orchestration of heterogeneous resources for LLMs, including computing power pooling, elastic resource mixed deployment, tidal resource borrowing, and quota management.
2. Participate in the multi-role, multi-stage scheduling of PD and EP scenarios for LLM inference services, as well as KVCache-centric scheduling, to achieve dynamic, timely, and accurate scaling in and out for services.
3. Participate in the optimal scheduling of computing resources, RDMA high-speed network resources, and cache/storage resources, and fully utilize the computing power of large-scale distributed clusters.
4. Contribute to the stability of LLM services, and accomplish problem localization, diagnosis, isolation, and rapid recovery across diverse heterogeneous resources (GPU, CPU, NPU, etc.), multi-cloud environments, and various network traffic scenarios via online and offline multi-system linkage.
5. Engage in task/service scheduling within multi-datacenter, multi-region, and multi-cloud scenarios to ensure the rational distribution of load.

Qualifications

Minimum Qualifications:
1. Demonstrate proficiency in at least one or two languages, such as C++, Go, Python, and Shell, within the Linux environment.
2. Possess an understanding of Kubernetes architecture and ecosystem, and be familiar with container technologies including Docker, Containerd, Kata, and Podman.
3. Exhibit excellent logical analysis skills, with the ability to reasonably abstract and break down business logic.
4. Have a strong sense of work responsibility, good learning capabilities, communication skills, and self-motivation, and be capable of rapid response and action.
5. Maintain good habits of creating working documents, and write and update workflow and technical documents in a timely manner as required.

Preferred Qualifications:
1. Be familiar with at least one mainstream machine learning framework or inference engine, and have relevant experience in optimizing the inference performance of large models.
2. Have practical experience in resource scheduling and service orchestration of large models, and have participated in the design, development, and maintenance of large-scale distributed systems.
3. Have an understanding of GPU systems and architectures.
4. Have experience in publishing papers at top academic conferences in the field of computer systems (including but not limited to OSDI, NSDI, SOSP, FAST, MLSys, EuroSys).

Candidates can apply to a maximum of two positions and will be considered for jobs in the order they apply. The application limit applies to jobs at ByteDance and its affiliates globally. Applications will be reviewed on a rolling basis, so we encourage you to apply early. Successful candidates must be able to commit to an internship period of at least 3 months.

By submitting an application for this role, you accept and agree to our global applicant privacy policy, which may be accessed here: https://jobs.bytedance.com/en/legal/privacy. If you have any questions, please reach out to us at apac-earlycareers@bytedance.com

Job Information

About Doubao (Seed)

Founded in 2023, the ByteDance Doubao (Seed) Team is dedicated to pioneering advanced AI foundation models. Our goal is to lead in cutting-edge research and drive technological and societal advancements. With a strong commitment to AI, our research areas span deep learning, reinforcement learning, Language, Vision, Audio, AI Infra and AI Safety. Our team has labs and research positions across China, Singapore, and the US.

Why Join ByteDance

Inspiring creativity is at the core of ByteDance's mission. Our innovative products are built to help people authentically express themselves, discover and connect, and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity and enrich life, a mission we work towards every day.

As ByteDancers, we strive to do great things with great people. We lead with curiosity, humility, and a desire to make impact in a rapidly growing tech company. By constantly iterating and fostering an "Always Day 1" mindset, we achieve meaningful breakthroughs for ourselves, our Company, and our users. When we create and grow together, the possibilities are limitless. Join us.

Diversity & Inclusion

ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

About The Company

Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.