Job Details:
Job Description:
Join Intel’s cutting-edge NPU (Neural Processing Unit) project as part of a multidisciplinary team that spans compiler development, model enablement, and large language model (LLM) engineering. You will play a key role in building the software stack that powers next-generation AI workloads on Intel’s NPUs, enabling efficient, dynamic, and scalable deployment of advanced machine learning models, including quantized and dynamic models.
Responsibilities:
- Design, implement, and optimize compiler features for Intel NPU architectures, ensuring high performance and efficient code generation.
- Enable and optimize machine learning models—including LLMs—for deployment on Intel NPUs, focusing on quantization, dynamic execution, and hardware-specific acceleration.
- Collaborate with other teams to productionize new models and support emerging AI workloads.
- Develop and maintain model conversion, quantization, and deployment pipelines, ensuring correctness, reproducibility, and compliance with Intel’s standards.
- Benchmark, profile, and debug models and software components to identify and resolve performance bottlenecks.
- Stay current with the latest advancements in compilers, model optimization, quantization, and LLM research, and integrate best practices into the NPU software stack.
- Write clear technical documentation and provide support to internal and external users.
Qualifications:
- Bachelor’s, Master’s, or Ph.D. in Computer Science, Electrical Engineering, or a related field.
- Proven expertise in C/C++ with strong software design and optimization skills.
- Solid understanding of AI model optimization techniques such as quantization, pruning, and distillation.
- Familiarity with large language models and their deployment requirements.
- Knowledge of computer architecture, hardware acceleration, and low-level performance tuning.
- Proficiency with Linux environments, virtualization, and CI/CD workflows.
- Strong analytical, problem-solving, and cross-team collaboration skills in fast-moving technical settings.
- Experience with modern compiler infrastructures (e.g., LLVM, MLIR) or code generation for custom accelerators is a plus.
- Hands-on experience with AI frameworks (OpenVINO, TensorFlow, PyTorch, ONNX), Python, and performance tools for NPUs, GPUs, or FPGAs is a plus.
- Experience developing dynamic execution or runtime systems that handle variable input sizes and adaptive behavior is a plus.
- Familiarity with collaborative tools (GitHub, Jira) and open-source contribution practices is a plus.
Job Type:
Experienced Hire
Shift:
Shift 1 (Romania)
Primary Location:
Romania, Timisoara
Additional Locations:
Business group:
The Client Computing Group (CCG) is responsible for driving business strategy and product development for Intel's PC products and platforms, spanning form factors such as notebooks, desktops, 2-in-1s, and all-in-ones. Working with our partners across the industry, we intend to deliver purposeful computing experiences that unlock people's potential, allowing each person to use our products to focus, create, and connect in the ways that matter most to them.
Posting Statement:
All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.
Position of Trust
N/A
Work Model for this Role
This role will be eligible for our hybrid work model, which allows employees to split their time between working on-site at their assigned Intel site and off-site. *Job posting details (such as work model, location, or time type) are subject to change.