Member of Technical Staff, High Performance Computing Engineer

1 Hour ago • 4 Years + • Artificial Intelligence

About the job

Job Description

Microsoft AI is seeking experienced High Performance Computing Engineers to contribute to the evolution of Copilot. This role involves working on large-scale supercomputers, building scalable services on public cloud infrastructure (Azure, AWS, GCP), and developing APIs to enhance Copilot's functionalities. Collaboration with product, design, and AI research teams is crucial. Responsibilities include building secure and performant AI platform services, working collaboratively with engineers and researchers, shipping high-quality code, and iteratively delivering work to users. The ideal candidate will be proactive, possess strong problem-solving skills, and thrive in a fast-paced environment.
Must have:
  • 4+ years of experience building web services (Python, C#, C++, Rust, Java)
  • 4+ years of experience with high-scale training clusters (SLURM, Kubernetes, Ray)
  • 4+ years of experience building scalable services on public cloud infrastructure
  • Build secure and performant AI Platform services
Good to have:
  • Experience with LLM training clusters
  • Experience with AI platforms, frameworks, and APIs
  • Experience with Machine Learning frameworks
  • Ability to identify and resolve complex technical issues
Perks:
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Overview

As Microsoft AI, we are pushing the boundaries of technology.

We are creating unique, beautiful and powerful products that will change lives. A small, friendly, fast-moving team, we support each other to do the best work of our lives, always looking to break new ground, fast. We are proud of what we build, how we build it and that our products will define the AI era. We run lean, obsess about users, and always make our decisions based on the evidence. We ship regularly, so your work will have real and immediate impact.

We are seeking experienced High Performance Computing Engineers to join our team and contribute to the evolution of our personal AI, Copilot. This role offers the unique opportunity to work on some of the largest scale supercomputers in the world, a rare chance to operate at such a significant scale. The right candidate will bring a wealth of positive energy, empathy, and kindness, coupled with a track record of effectiveness. You'll be proactive, relishing the challenge of crafting top-tier consumer experiences and products swiftly and efficiently. Our team is at the forefront of developing APIs that enhance our ability to fine-tune and deploy Copilot's core functionalities, in partnership with our Product Management, Design, and AI Research teams.

 

Our newly formed organization, Microsoft AI, is dedicated to advancing Copilot and other consumer AI products and research. The team is responsible for Copilot, Bing, Edge, and generative AI research. Come be a part of the team shaping the future of personal computing.


Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
 

 

By applying to this Mountain View, CA OR Redmond, WA position, you are required to be local to the San Francisco OR Seattle area and in office 3 days a week.  

 

Qualifications

Required/Minimum Qualifications:

  • Bachelor’s degree in computer science, or related technical discipline AND 4+ years technical engineering experience building web services with coding in languages including, but not limited to, Python, C#, C++, Rust, Java
    • OR equivalent experience.
  • 4+ years of experience working with high-scale training clusters (e.g., working with frameworks/tools such as NVIDIA InfiniBand clusters, SLURM, Kubernetes, Ray)
  • 4+ years' experience building scalable services on top of public cloud infrastructure like Azure, AWS, or GCP.

Preferred Qualifications:

  • Experience with LLM training clusters. 
  • Experience working with AI platforms, frameworks, and APIs.
  • Experience using Machine Learning frameworks, including experience using, deploying, and scaling large language models, either personally or professionally.
  • Ability to identify, analyze, and resolve complex technical issues, ensuring optimal performance, scalability, and user experience.
  • Dedication to writing clean, maintainable, and well-documented code with a focus on application quality, performance, and security.
  • Demonstrated interpersonal skills and ability to work closely with cross-functional teams, including product managers, designers, and other engineers.
  • Ability to clearly communicate complex technical concepts to both technical and non-technical stakeholders.
  • Passion for learning new technologies and staying up to date with industry trends, best practices, and emerging technologies in web development and AI.
  • Ability to work in a fast-paced environment, manage multiple priorities, and adapt to changing requirements and deadlines.
  • Proven ability to collaborate and contribute to a positive, inclusive work environment, fostering knowledge sharing and growth within the team.

 

Software Engineering IC4 - The typical base pay range for this role across the U.S. is USD $117,200 - $229,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $153,600 - $250,200 per year.

 

Software Engineering IC5 - The typical base pay range for this role across the U.S. is USD $137,600 - $267,000 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $180,400 - $294,000 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

Microsoft will accept applications and process offers for these roles on an ongoing basis.

 




Responsibilities

  • Build secure and performant AI Platform services that power Copilot.
  • Work collaboratively with other platform, infrastructure, and application engineers, as well as AI researchers, to build next-generation AI products and services.
  • Ship high-quality, well-tested, secure, and maintainable code.
  • Find a path past roadblocks to get your work into the hands of users quickly and iteratively.
  • Enjoy working in a fast-paced, design-driven product development cycle.
  • Embody our culture and values.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
$117.2K - $294.0K/yr (Outscal est.)
$205.6K/yr avg.
Mountain View, California, United States

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
