Solutions Architect, Data Center Infrastructure

1 Month ago • 3 Years + • DevOps • $120,000 PA - $235,750 PA

Job Summary

Job Description

NVIDIA seeks a Solutions Architect for Data Center Infrastructure to lead planning and deployments of AI data centers. Responsibilities include data center audits, planning, and deployment, ensuring infrastructure integrity aligns with NVIDIA reference architectures. This involves power distribution, cooling systems, networking, server hardware, storage, and telemetry. The role requires pre-deployment planning, risk identification, vendor training, and infrastructure design evaluation for consistency with industry standards. Testing, troubleshooting, and validation of compute systems are key, along with mentorship and continuous improvement initiatives. Collaboration with internal teams, vendors, and customers is crucial for seamless integration of data center infrastructure solutions.
Must have:
  • 3+ years data center experience
  • Data center operations knowledge
  • Power distribution & cooling expertise
  • Networking & server hardware knowledge
  • Pre-deployment planning skills
  • Strong problem-solving skills
  • Excellent communication skills
  • Willingness to travel (40%)
Good to have:
  • Linux system administration
  • Relevant certifications
Perks:
  • Equity
  • Benefits

Job Details

NVIDIA is seeking a Solutions Architect in Data Center Infrastructure to join our Infrastructure Specialists team. Academic and commercial groups worldwide are using NVIDIA products to redefine deep learning and data analytics, and to power data centers. Join the team building many of the world's largest and fastest data centers and supercomputers! NVIDIA is looking for someone who can lead planning and deployments of AI data centers, including power and cooling systems, cabling, network provisioning, and bring-up and validation.

As the NVIS Solutions Architect for Datacenter Infrastructure, you will focus on data center audit, planning, and deployment, ensuring the integrity of NVIDIA platform infrastructure. Your primary goal will be to guarantee that all aspects of the data center's physical infrastructure are meticulously planned, implemented, and validated to meet NVIDIA reference architectures, operational requirements, and industry standards. This infrastructure includes architectural systems, power distribution, liquid/air cooling systems, compute, network and cabling (fiber and copper), and telemetry systems.

What you will be doing:

  • NVIS Datacenter Engineering and planning: Collaborate with other teams to plan and implement data center infrastructure solutions based on NVIDIA Datacenter reference architecture, including power distribution, cooling systems, network architecture, server hardware, and storage systems.

  • Plan and manage deployment of NVIDIA's pioneering AI infrastructure solutions, including highly complex rack-scale, liquid-cooled compute and networking hardware systems, in a fluid and fast-paced environment.

  • Conduct pre-deployment planning, including reviewing cluster and data center architecture, planning network port mapping and the fiber optic cabling BOM, identifying potential risks, training vendors, and finding areas for improvement.

  • Evaluate customers' and partners' infrastructure design proposals for consistency with industry standards and regulatory requirements. Provide feedback and recommendations to improve performance, scalability, and cost-effectiveness.

  • Perform testing, troubleshooting and validation of compute systems based on collaboration with product and engineering teams.

  • Act as the NVIS mentor providing guidance, mentorship, and support to ensure the NVIS team's success in their respective roles.

  • Quality Assurance: Establish and enforce quality assurance processes to verify that deployments meet established specifications and performance benchmarks. Conduct thorough bring-up, testing, and validation to confirm the functionality and reliability of infrastructure components.

  • Continuous Improvement: Drive continuous improvement initiatives to enhance the efficiency, resilience, and sustainability of data center infrastructure for the NVIDIA data center reference architecture and deployment blueprint. Find opportunities to streamline processes, automate repetitive tasks, and leverage emerging technologies to optimize infrastructure operations.

  • Collaboration and Communication: Collaborate and communicate across internal teams, external vendors, and customers to facilitate the seamless integration of data center infrastructure solutions. Serve as a domain expert and point of contact for infrastructure-related inquiries and blocking issues.

What we need to see:

  • Bachelor's degree (or equivalent experience) in Engineering, Computer Science, Information Technology, or a related field.

  • 3+ years of overall experience in enterprise and/or hyperscale data centers with continuous infrastructure deployment experience, preferably for high-density AI/HPC data centers.

  • Working experience in data center operations, or infrastructure management roles, focusing on large-scale data center deployments.

  • Strong technical knowledge and experience in the data center stack: power distribution, liquid cooling, servers, networking, storage, and pre-deployment planning.

  • Relevant certification – preferred

  • Demonstrated technical and project leadership in fluid situations, with the ability to adapt to unknowns and change.

  • Excellent analytical, problem-solving, and decision-making skills, keen attention to detail, and a commitment to quality.

  • Excellent communication and interpersonal abilities, capable of engaging with various collaborators, such as customers, to enable productive discussions.

  • Organization & Time Management – able to plan, schedule, and organize tasks related to the job to achieve goals within or ahead of established time frames.

  • Willingness to travel (40%).

Ways to stand out from the crowd:

  • Linux system administration skills

  • Strong knowledge of the whole data center infrastructure stack

  • Flexible/agile and enjoys solving challenging problems

NVIDIA is widely considered one of the world's most desirable employers in technology. We have some of the world's most forward-thinking and passionate people working for us. If you're creative and autonomous, we want to hear from you!

The base salary range is 120,000 USD - 235,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.