AI System and Software Research Intern

Intel

1+ Years | Beijing, China (Hybrid) | Full Time | 1 week ago

Job Summary

This AI System and Software Research Intern role focuses on GPU/NPU software development and optimization, including implementing high-performance kernels and profiling for bottlenecks. The intern will also prototype, develop, and tune Robotics AI systems, collaborating with algorithm and hardware teams to deploy models with real-time constraints. Additionally, the role involves research into Agentic Systems, designing and accelerating KV Cache for large model inference and exploring agent-based inference frameworks in robotic AI scenarios. Candidates should be pursuing a Master's or Ph.D. in relevant fields, proficient in C/C++ and Python, with solid CUDA experience and familiarity with AI models and profiling tools.

Must Have

Currently enrolled in a Master's or Ph.D. program (Computer Science, Electrical Engineering, AI, Mathematics, or related fields)
Proficient in C/C++ and Python
Solid understanding of the CUDA programming model
1 year of hands-on CUDA experience (kernel development, streams, memory management, optimization)
Experience with profiling tools such as Nsight, VTune, Perf, TensorBoard
Familiarity with Transformers, CNNs, RNNs and typical performance bottlenecks during inference
Good reading/writing skills in English
Effective teamwork across multidisciplinary groups
Strong passion for pushing the boundaries of GPU/NPU acceleration, robotics AI, and Agentic systems

Good to Have

Experience with KV Cache, attention mechanism optimization, or model compression (quantization, pruning, distillation)
Hands-on work with Agentic/agent-based AI frameworks (e.g., ReAct, Tool Use, AutoGPT)
Development experience on NPUs or other heterogeneous accelerators
Contributions to open-source projects such as TensorRT, ONNX Runtime, oneAPI
Linux system tuning, driver development, or low-level hardware interface knowledge

Perks & Benefits

Hybrid work model

Job Description

GPU/NPU Software Development and Optimization
Implement high-performance kernels, operators, and libraries for GPU/NPU.
Profile with Nsight Systems/Compute, VTune, Perf, TensorBoard, etc., identify bottlenecks, and apply code-level optimizations.
Robotics AI System Prototyping, Development and Tuning
Collaborate with algorithm and hardware teams to deploy various models on GPU/NPU-based development platforms under real-time performance constraints.
Build automated benchmarks, generate performance reports, and propose optimization strategies.
Agentic System Research (KV Cache, etc.)
Design, implement, and accelerate KV Cache and related mechanisms for large model inference.
Explore and prototype Agentic (agent-based, self-adapting) inference frameworks, and evaluate them in robotic AI scenarios.

Qualifications:

Currently enrolled in a Master's or Ph.D. program (Computer Science, Electrical Engineering, AI, Mathematics, or related fields).
Proficient in C/C++ and Python; ability to write clean, maintainable code.
Solid understanding of the CUDA programming model; 1 year of hands-on CUDA experience (kernel development, streams, memory management, optimization).
Experience with profiling tools such as Nsight, VTune, Perf, TensorBoard, etc.
Familiarity with Transformers, CNNs, RNNs and the typical performance bottlenecks during inference.
Good reading/writing skills in English; effective teamwork across multidisciplinary groups.
Strong passion for pushing the boundaries of GPU/NPU acceleration, robotics AI, and Agentic systems.

Preferred Skills:

Experience with KV Cache, attention mechanism optimization, or model compression (quantization, pruning, distillation).
Hands-on work with Agentic/agent-based AI frameworks (e.g., ReAct, Tool Use, AutoGPT).
Development experience on NPUs or other heterogeneous accelerators.
Contributions to open-source projects such as TensorRT, ONNX Runtime, oneAPI, etc.
Linux system tuning, driver development, or low-level hardware interface knowledge.
