Senior/Staff/Principal Engineer role specializing in Edge AI with focus on GPU/TPU acceleration for local Large Language Model (LLM) inference. Requires expertise in embedded GPU/TPU architectures, AI inference model optimization, low-latency model inference pipelines, and experience with edge computing platforms and applications.
Must have:
GPU/TPU Acceleration
Edge AI Inference
LLM Optimization
Low-Latency Pipelines
Good to have:
Micro-architecture Development
GPU Hardware Design
Edge Computing Platforms
AI Frameworks
Perks:
Team Leadership
Project Management
Senior/Staff/Principal Engineer – Edge AI LLM 9291
We are seeking a talented Senior/Staff/Principal Engineer with specialized expertise in GPU/TPU acceleration to join our team. The ideal candidate will have extensive hands-on experience in local Large Language Model (LLM) inference on embedded GPU/TPU architectures. As a Principal Engineer specializing in Edge AI, you will play a crucial role in shaping the future Edge AI solution, leveraging GPU/TPU acceleration and enterprise-grade, large-scale edge compute.
The successful candidate will combine technical excellence with effective leadership, creating a positive impact on both projects and team dynamics.
Key Responsibilities:
High-Level Design and Architecture:
Influence the Edge AI strategy by providing expert advice on design and architecture.
Make critical decisions regarding technical directions, scalability, and system performance.
Develop and optimize AI inference models for deployment on edge devices with embedded GPU/TPU accelerators, focusing on local Large Language Model (LLM) inference.
Implement and fine-tune low-latency model inference pipelines to meet real-time performance requirements.
Collaborate with cross-functional teams to integrate AI inference solutions into edge computing platforms and applications.
Collaborate with the GPU Hardware Design Team to design and optimize GPUs that power next-generation devices.
Conduct performance profiling and optimization to maximize the efficiency of GPU/TPU acceleration for local LLM inference.
Work on micro-architecture development, ensuring efficient execution of graphics, compute, and AI workloads within energy and area constraints.
Stay current with advancements in GPU/TPU technologies and edge AI frameworks, incorporating them into solution designs as appropriate.
Provide technical expertise and support to project teams, ensuring successful implementation and deployment of edge AI solutions.
Team Leadership:
Lead and inspire a team of engineers, providing guidance, setting goals, and ensuring collaboration.
Oversee project planning, execution, and delivery, ensuring alignment with business objectives.
Manage all phases of technical projects, from conception to completion.
Develop project specifications, track progress, and control costs.
Foster a positive work environment, encouraging professional growth and knowledge sharing.