About the job
We are now looking for a Senior GPU Architect! The NVIDIA GPU and SoC Architecture group is seeking strong architects with great analytical skills and a deep understanding of system architecture and performance. You will apply those skills creatively to the processor and system architecture performance of full applications, driving scalable improvements across all of our artificial intelligence/machine learning, automotive, GeForce and high-performance computing products. This position offers you the opportunity to have a real impact on the hardware and software that underlie the most exciting trends in modern computing. We are looking for someone who is passionate about their work and excited to creatively apply what they know to make a difference.
What You'll Be Doing
- Performance and bottleneck analysis of sophisticated, high-performance GPUs and System-on-Chips (SoCs).
- Work on hardware models at different levels of abstraction, including performance models, emulators and silicon, to find performance bottlenecks in the system.
- Work closely with the architecture and design teams to explore architecture trade-offs related to system performance, area, and power consumption.
- Understand the key performance use cases of the product. Develop workloads and test suites targeting graphics, machine learning, automotive, video and computer vision applications running on these products.
- Drive methodologies for improving turnaround time, finding representative data-sets and enabling performance analysis early in the product development cycle.
- Develop the required infrastructure, including performance simulators, prototype drivers, compilers and analysis tools.
What We Need To See
- BE/BTech or MS/MTech in a relevant area, or equivalent experience, with 5+ years of experience. A PhD is a plus.
- Proven experience in performance analysis and with sophisticated System-on-Chip and/or GPU architectures.
- Proven history of technical leadership.
- Strong understanding of System-on-Chip (SoC) architecture, graphics pipeline, memory subsystem architecture and Network-on-Chip (NoC)/Interconnect architecture.
- Hands-on competence in programming (C/C++) and scripting (Perl/Python). Exposure to Verilog/SystemVerilog or SystemC/TLM is a strong plus.
- Strong debugging and analysis skills (including data and statistical analysis), including the use of RTL dumps to debug failures.