Senior/Principal DL/LLM Performance Architect

Rivos

Job Summary

Join a cutting-edge, well-funded hardware startup in Silicon Valley as a Deep Learning and Large Language Model Performance Architect. The company aims to reimagine silicon and create RISC-V-based accelerated computing platforms that will transform the industry. You will collaborate with talented engineers to develop designs that push the boundaries of performance, energy efficiency, and scalability in a fun, creative, and flexible work environment. Responsibilities include analyzing the performance of key workloads, tuning software, proposing improvements, developing analytical models for target systems, identifying performance bottlenecks, and making recommendations to implementation teams. You will also work with deep learning software engineers and hardware architects, adapt to the evolving AI industry, and contribute across the codebase. Tasks also include pre-silicon and post-silicon performance validation.

Must Have

  • MS or PhD in CS, EE, Math, or equivalent
  • 5+ years of experience
  • In-depth knowledge of DL or LLM models
  • Strong background in computer architecture or AI software stack/compilers
  • Strong C/C++ programming skills
  • Strong hardware modeling skills
  • Strong problem-solving and analytical thinking

Good to Have

  • Performance modeling and analysis background
  • GPU programming experience (CUDA)
  • LLVM/MLIR development experience
  • Good communication and organizational skills

Job Description

Join a cutting-edge and well-funded hardware startup in Silicon Valley as a Deep Learning and Large Language Model Performance Architect. Our mission is to reimagine silicon and create RISC-V-based accelerated computing platforms that will transform the industry. You will have the opportunity to work with some of the most talented and passionate engineers in the world to create designs that push the envelope on performance, energy efficiency, and scalability. We offer a fun, creative, and flexible work environment, with a shared vision of building products that change the world.

Job Responsibilities
* Workload Analysis - Analyzing the performance of important workloads, tuning our current software, and proposing improvements for future software.
* Performance modeling and analysis - Developing analytical models for target systems, analyzing performance bottlenecks, and making recommendations to the implementation teams.
* Working with cross-functional teams of deep learning software engineers and hardware architects to develop innovative solutions.
* Adapting to the constantly evolving AI industry by being agile and excited to contribute across the codebase.
* Pre-silicon and post-silicon performance validation
 
Qualifications
* MS or PhD in a relevant discipline (CS, EE, Math) or equivalent experience, with 5+ years of work experience
* In-depth knowledge of deep learning models or large language models
* Strong background in computer architecture or AI software stack/compilers
* Strong C/C++ programming and hardware modeling skills
* Strong problem-solving and analytical thinking skills
* Performance modeling and analysis background a plus
* GPU programming experience (CUDA) a plus
* LLVM/MLIR development experience a plus
* Good communication and organizational skills

5 Skills Required For This Role

Communication, C++, Agile Development, CUDA, Deep Learning