Data Center Cluster Architect

4 Months ago • All levels • $207,800 PA - $378,700 PA
Data Analysis

Job Description

The Data Center Systems Architecture team is seeking a Cluster Architect to design and optimize computer architectures for high-performance computing (HPC) clusters. This role involves creating complex system architectures and meeting product goals related to performance, size, power, thermal, and cost. The architect will define infrastructure details, collaborate with engineering teams on cluster network integration, and ensure efficient data flow. Responsibilities include defining rack and cluster configurations, designing optimized networks for AI/ML clusters, influencing hardware and software selection, analyzing network traffic, and collaborating with various stakeholders. Innovation, championing new features, and mentoring junior engineers are also part of the role. This position may require occasional travel.

The Datacenter Systems Architecture team seeks an outstanding Cluster Architect to design and optimize computer architectures specifically for high-performance computing (HPC) clusters. This position is a multi-disciplinary, cross-functional lead engineering role encompassing all aspects of computer system design. The candidate will have the skills and experience to create complex system architectures, surprise and delight our customers, and advance our products’ performance, size, power, thermal, and cost goals.

As a technical specialist, negotiate and document the solution details of the infrastructure from a physical, electrical, and logical perspective of compute clusters within the datacenter.

Collaborate and leverage domain expertise to provide guidance and leadership to cross-functional engineering teams, integrating cluster network architectures into the overall system architecture to ensure efficient data flow, impact product definitions, and meet scalability requirements.

Define the rack and cluster capabilities, configurations, and scale-out requirements to support the deployment of dense compute and specialty compute workloads and applications, including but not limited to the following:
Pathfinding on novel cluster architecture choices with a broad group of architects and system engineers, networking, technical leads, and HW/SW stakeholders.
Creating optimized network designs for large-scale AI/ML clusters, considering factors like bandwidth, latency, and scalability.
Influencing the selection of networking hardware and software components for the cluster, including switches, adapters, and protocols.
Analyzing network traffic patterns and implementing strategies to improve data transfer speeds within the cluster for target topologies and choice configurations.

Collaborate with mechanical, physical, electrical, thermal, power, networking, OS, SW, and datacenter infrastructure stakeholders for performant, scalable deployments.

Be innovative and curious. Explore and champion new product-level features and workflows.

Define, develop, and utilize tools, scripts, automation, and methods of system analysis for the performance of compute clusters within a DC environment.

Mentor junior engineers in best practices and data-driven processes.

The role may require occasional domestic and international travel.
