Deep Learning Performance Architect

NVIDIA

Job Summary

As a Deep Learning Performance Architect at NVIDIA, you'll benchmark and analyze AI workloads, develop simulators and debuggers in C++/Python, evaluate performance, power, and area (PPA) trade-offs, and collaborate with architecture teams. Responsibilities include analyzing hardware features, keeping abreast of deep learning trends, and communicating technical concepts effectively. You'll contribute to building real-time, cost-effective computing platforms for AI applications.

Must Have

  • MS/PhD in relevant field (CS, EE, Math)
  • 2+ years experience in parallel computing
  • Strong programming skills (C, C++, Python)
  • Proficiency in architecture analysis & modeling
  • Excellent problem-solving skills

Good to Have

  • Understanding of transformer-based models
  • Experience with benchmarking and workload profiling

Job Description

NVIDIA has continuously reinvented itself. Our invention of the GPU sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. Today, research in artificial intelligence is booming worldwide, which calls for highly scalable and massively parallel computation horsepower, an area in which NVIDIA GPUs excel.

NVIDIA is a “learning machine” that constantly evolves by adapting to new opportunities that are hard to solve, that only we can address, and that matter to the world. This is our life’s work: to amplify human creativity and intelligence. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join our diverse team and see how you can make a lasting impact on the world!

Intelligent machines powered by Artificial Intelligence computers that can learn, reason and interact with people are no longer science fiction. GPU Deep Learning has provided the foundation for machines to learn, perceive, reason and solve problems. NVIDIA's GPUs run AI algorithms, simulating human intelligence, and act as the brains of computers, robots and self-driving cars that can perceive and understand the world. Increasingly known as “the AI computing company”, NVIDIA wants you. Come, join our Deep Learning Architecture team, where you can help build real-time, cost-effective computing platforms driving our success in this exciting and rapidly growing field!

What you'll be doing:

  • Benchmark and analyze AI workloads in single and multi-node configurations.

  • Develop high-level simulators and debuggers in C++/Python.

  • Evaluate PPA (performance, power, area) for hardware features and system-level architectural trade-offs.

  • Work closely with the wider architecture teams and with product management to help with trade-off analysis at every stage of the project.

  • Keep abreast of emerging trends and research in deep learning.

What we need to see:

  • MS or PhD in a relevant discipline (CS, EE, Math).

  • 2+ years of experience in parallel computing architectures, interconnect fabrics and deep learning applications.

  • Strong programming skills in C, C++ and Python.

  • Proficiency in architecture analysis and performance modeling.

  • Curious mindset with excellent problem-solving skills.

Ways to stand out from the crowd: 

  • Understanding of modern transformer-based model architectures.

  • Experience with benchmarking, projection methodologies, workload profiling and correlation.

  • Ability to simplify and communicate rich technical concepts to a non-technical audience.

#LI-Hybrid
