Senior Math Libraries Engineer, CPU and GPU Optimization

NVIDIA

Job Summary

NVIDIA is seeking an expert software engineer to develop and optimize CUDA-X libraries for CPU and GPU ecosystems. This role involves designing modern, flexible APIs and kernels for mathematical libraries, collaborating with internal and external partners to understand requirements, and delivering timely releases. The engineer will contribute to NVIDIA's accelerated computing platform, which is crucial for HPC and AI applications, and continuously survey software system trends to ensure the libraries are future-proof and high-performing across evolving hardware and software environments.

Job Description

Job Requisition ID: JR2004474

Job Category: Engineering

Time Type: Full time

NVIDIA is looking for an expert software engineer to help us deliver CUDA-X libraries across the NVIDIA CPU and GPU ecosystem. For over a decade, NVIDIA's accelerated computing platform has revolutionized HPC and AI with applications ranging from COVID-19 research to autonomous machines. Did you know that our team develops the GPU/CPU-accelerated mathematical libraries that make all of this possible?

The accelerated computing ecosystem, both hardware and software, is constantly evolving, including shifts towards hybrid backends, deep integration with high-level languages and ecosystems (such as Python, NumPy, JAX, MLIR…), and optimization at runtime for maximum flexibility and performance. Our libraries follow the CUDA Everywhere approach to let developers use highly optimized mathematical operations on all hardware available in the NVIDIA ecosystem. You will be part of a team designing, developing, and optimizing math libraries for the future. If you are passionate about designing modern HPC libraries and want to build software that will stand the test of time as it accelerates countless applications, we might have the dream job you have been waiting for!

What you'll be doing:

  • Design modern, flexible, and easy-to-use APIs and kernels for math libraries and lead design reviews with all collaborators.
  • Work closely with internal partners (e.g., Engineering, Product Management) and external partners, such as researchers, to understand their use cases and requirements.
  • Work with internal and external customers to deliver timely math library releases.
  • Become a domain expert by continuously surveying current trends in software systems.

What we need to see:

  • PhD or MSc degree in Computer Science, Applied Math, or a related science or engineering field is preferred (or equivalent experience).
  • 12+ years of experience designing and developing software for high-performance computing and/or AI applications.
  • Advanced C++ skills, including modern design paradigms (e.g., template metaprogramming, RAII).
  • Parallel programming experience with CUDA, OpenCL, or vector programming on CPUs (AVX, NEON, or similar).
  • Strong collaboration, communication, and documentation habits.
  • Experience with ARM, RISC-V and/or x86_64 CPU architectures.

Ways to stand out from the crowd:

  • Strong background in numerical methods (e.g., FFT, numerical linear algebra).
  • Programming skills in Python and experience with modern automation setups for both building software (e.g., CMake) and testing it (e.g., CI/CD, sanitizers).
  • Background in cross-compilation, setting up CPU/GPU/accelerator (cross-)compilation toolchains, and porting existing code to new architectures.
  • Experience with CCCL, OpenMP, OpenACC, multi-threading, MPI, PGAS.
  • Experience with scientific and deep learning libraries and frameworks such as PyTorch, JAX, MKL, MAGMA, PETSc, and Kokkos.

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to unprecedented growth, our exclusive engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. For Poland: The base salary range is 375,000 PLN - 650,000 PLN for Level 5, and 483,750 PLN - 838,500 PLN for Level 6.
