Software Architecture Engineer
NVIDIA
Job Summary
As a Software Architecture Engineer at NVIDIA, you will contribute to the development of AI data centers and networks. This role involves designing and analyzing networking architectures for large-scale AI workloads, optimizing AI-cluster networking through performance modeling, and defining strategic networking solutions. You will collaborate with cross-functional teams to research and validate new networking technologies, making a significant impact on powerful AI Cloud solutions.
Must Have
- Design and analyze end-to-end networking architectures for large-scale AI workloads and distributed training systems
- Push the boundaries of AI-cluster networking through performance modeling, simulation, and optimization techniques
- Define strategic networking solutions for NVIDIA's AI infrastructure in collaboration with adjacent software and hardware architects
- Collaborate closely with cross-functional teams to research, prototype, and validate new networking technologies for AI applications
- Bachelor’s, Master’s, or PhD in Computer Science or Electrical Engineering, or equivalent experience
- 5+ years of experience building large-scale distributed systems or performance-critical software
- Deep understanding of deep learning systems, GPU acceleration, and AI model execution flows
- Solid software engineering skills in C++ and/or Python, with strong familiarity with CUDA or similar platforms
- Strong system-level thinking across memory, networking, scheduling, and compute orchestration
- Excellent communication skills and ability to collaborate across diverse technical domains
Good to Have
- Deep knowledge of networking protocols, system architecture, and end-to-end performance analysis
- Knowledge of high-speed networking technologies, InfiniBand, Ethernet/IP, or data center networking
- Proven experience with network simulation tools, traffic analysis, or networking stack optimization
- Experience in distributed AI training, parallel computing, or HPC cluster architectures
- Understanding of GPU computing, CUDA programming, or AI/ML workload characteristics
Perks & Benefits
- Competitive salaries
- Generous benefits package
Job Description
Be part of NVIDIA as a Software Architecture Engineer and contribute to the development of brand-new AI data centers and networks. Join a team of hard-working engineers and play a vital role in crafting the future of our powerful AI Cloud solutions. Grow within a dynamic, inclusive environment and make a lasting impact in the tech world.
What you'll be doing:
- Designing and analyzing end-to-end networking architectures for large-scale AI workloads and distributed training systems.
- Pushing the boundaries of AI-cluster networking through performance modeling, simulation, and optimization techniques.
- Defining strategic networking solutions for NVIDIA's AI infrastructure in collaboration with adjacent software and hardware architects.
- Collaborating closely with cross-functional teams to research, prototype, and validate new networking technologies for AI applications.
What we need to see:
- Bachelor’s, Master’s, or PhD in Computer Science or Electrical Engineering, or equivalent experience.
- 5+ years of experience building large-scale distributed systems or performance-critical software.
- Deep understanding of deep learning systems, GPU acceleration, and AI model execution flows.
- Solid software engineering skills in C++ and/or Python, with strong familiarity with CUDA or similar platforms.
- Strong system-level thinking across memory, networking, scheduling, and compute orchestration.
- Excellent communication skills and ability to collaborate across diverse technical domains.
Ways to stand out from the crowd:
- Deep knowledge of networking protocols, system architecture, and end-to-end performance analysis.
- Knowledge of high-speed networking technologies, InfiniBand, Ethernet/IP, or data center networking.
- Proven experience with network simulation tools, traffic analysis, or networking stack optimization.
- Experience in distributed AI training, parallel computing, or HPC cluster architectures.
- Understanding of GPU computing, CUDA programming, or AI/ML workload characteristics.
With competitive salaries and a generous benefits package, NVIDIA is widely considered to be one of the most desirable employers in the world. We have some of the most experienced and talented individuals in the world working for us. If you are creative, autonomous, and thrive on challenges, we want to hear from you.