Staff HPC Hardware Engineer

7+ Years • Software Development & Engineering • $349,000 PA - $581,000 PA

Job Summary

Job Description

Lambda is dedicated to empowering the smartest minds in AI by providing cutting-edge HPC infrastructure. We build and scale the world's best deep learning cloud, supporting massive, world-changing AI deployments. The Engineering team is crucial for developing and maintaining our cloud offering, including the website, APIs, and internal tooling for system deployment and management. This role involves integrating compute, storage, and network hardware into our HPC platform, driving new product introduction, and ensuring system compatibility and performance.
Must have:
  • Serve as technical lead for integrating OEM and white-label hardware into Lambda’s HPC platform reference architectures.
  • Drive end-to-end NPI for hardware systems, including evaluation, validation, documentation, and production readiness.
  • Partner with architects to translate platform blueprints into concrete hardware selections and system configurations.
  • Work cross-functionally to ensure compatibility, performance, and scalability of new systems.
  • Identify and resolve hardware issues across thermal, power, firmware, and mechanical domains.
  • Provide technical guidance during vendor engagements and benchmarking of next-generation platforms.
  • 7+ years of experience in hardware integration or systems engineering for HPC, data center, or cloud infrastructure.
  • Possess deep knowledge of server hardware platforms (x86 and ARM), PCIe accelerators, storage devices, and network fabrics.
  • Experienced with vendor-led product development cycles and can drive hardware evaluation, risk mitigation, and feedback.
  • Can interpret platform-level architecture requirements and select or adapt OEM solutions to fit.
  • Comfortable working hands-on in labs with rack-scale deployments, BIOS/firmware tuning, and performance validation.
  • Collaborate well across architecture, design, engineering, and vendor teams to deliver production-ready hardware solutions.
Good to have:
  • Experience supporting AI/ML infrastructure and accelerated compute hardware (e.g., NVIDIA, AMD, Intel)
  • Familiarity with system thermals, power delivery, or integration at rack-scale
  • Exposure to BMC/Redfish/IPMI configuration and automation
  • Background in performance tuning, benchmarking, and systems validation workflows
  • Prior experience contributing to reference designs or large-scale infrastructure blueprints
Perks:
  • Generous cash & equity compensation
  • Health, dental, and vision coverage for you and your dependents
  • Wellness and commuter stipends for select roles
  • 401(k) plan with 2% company match (USA employees)
  • Flexible Paid Time Off Plan

Job Details

We're here to help the smartest minds on the planet build Superintelligence. The labs pushing the edge? They run on Lambda. Our gear trains and serves their models, our infrastructure scales with them, and we move fast to keep up. If you want to work on massive, world-changing AI deployments with people who love action and hard problems, we're the place to be.

If you'd like to build the world's best deep learning cloud, join us.

Note: This position requires presence in our San Jose office 4 days per week; Lambda’s designated work-from-home day is currently Tuesday.

Engineering at Lambda is responsible for building and scaling our cloud offering. Our scope includes the Lambda website, cloud APIs and systems as well as internal tooling for system deployment, management and maintenance.

What You’ll Do

  • Serve as the technical lead for integrating OEM and white-label compute, storage, and network hardware into Lambda’s HPC platform reference architectures.
  • Drive the end-to-end process of new product introduction (NPI) for hardware systems, including evaluation, validation, documentation, and production readiness.
  • Partner with architects to translate platform blueprints into concrete hardware selections and system configurations.
  • Work cross-functionally with design, engineering, operations, and vendor engineering teams to ensure compatibility, performance, and scalability of new systems.
  • Identify and resolve hardware issues across thermal, power, firmware, and mechanical domains during evaluation and bring-up cycles.
  • Provide technical guidance during vendor engagements and benchmarking of next-generation platforms.

You

  • Have 7+ years of experience in hardware integration or systems engineering for HPC, data center, or cloud infrastructure environments.
  • Possess deep knowledge of server hardware platforms (x86 and ARM), PCIe accelerators, storage devices, and network fabrics.
  • Are experienced with vendor-led product development cycles and can drive hardware evaluation, risk mitigation, and feedback into roadmap decisions.
  • Can interpret platform-level architecture requirements and select or adapt OEM solutions to fit.
  • Are comfortable working hands-on in labs with rack-scale deployments, BIOS/firmware tuning, and performance validation.
  • Collaborate well across architecture, design, engineering, and vendor teams to deliver complete, production-ready hardware solutions.

Nice to Have

  • Experience supporting AI/ML infrastructure and accelerated compute hardware (e.g., NVIDIA, AMD, Intel).
  • Familiarity with system thermals, power delivery, or integration at rack-scale.
  • Exposure to BMC/Redfish/IPMI configuration and automation.
  • Background in performance tuning, benchmarking, and systems validation workflows.
  • Prior experience contributing to reference designs or large-scale infrastructure blueprints.

Salary Range Information

The annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

A Final Note:

You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer

Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
