Software Engineer, Cloud Infrastructure

6 Months ago • 5 Years + • DevOps

Job Summary

Job Description

As a Software Engineer, Cloud Infrastructure at Scale, you'll design and build core cloud infrastructure platforms and systems, supporting orchestration, data abstraction, and data pipelines. You'll leverage AWS, Kubernetes, Docker, Terraform, Helm, and more, working closely with stakeholders and internal customers.
Must have:
  • Cloud Infrastructure
  • AWS Experience
  • Distributed Systems
  • Software Development
Good to have:
  • Azure & GCP
  • GPU-based Compute
  • Hyper-growth Startups
  • AI Technologies
Perks:
  • AI Race Exposure
  • World-class RLHF

Job Details

Software is eating the world, but AI is eating software. We live in unprecedented times – AI has the potential to exponentially augment human intelligence. Every person will have a personal tutor, coach, assistant, personal shopper, travel guide, and therapist throughout life. As the world adjusts to this new reality, leading platform companies are scrambling to build LLMs at billion scale, while large enterprises figure out how to add them to their products. To be safe, aligned, and actually useful, these models need human evaluation and reinforcement learning from human feedback (RLHF) during pre-training, fine-tuning, and production evaluations. This is the main innovation that has enabled ChatGPT to get such a large head start over the competition.

At Scale, our products include the Generative AI Data Engine, SGP, Donovan, and others that power the most advanced LLMs and generative models in the world through world-class RLHF, human data generation, model evaluation, safety, and alignment. The data we are producing is some of the most important work for how humanity will interact with AI.

At the foundation of these products is the Platform Engineering team. In this role, you will help lead the design and development of core cloud infrastructure platforms and systems, while supporting orchestration, data abstraction, data pipelines, identity & access management, and underlying infrastructure. You’ll also get widespread exposure to the forefront of the AI race as Scale sees it in enterprises, startups, governments, and large tech companies.

You will:

  • Own the underlying cloud infrastructure stack running on AWS, leveraging Kubernetes, Docker, Terraform, Helm, and other common tools and frameworks.
  • Drive the architecture, design, implementation, and support of our foundational platforms and systems, working closely with stakeholders and internal customers to understand and refine requirements.
  • Collaborate with cross-functional teams to define, design, and deliver new features.
  • Proactively identify opportunities for, and drive improvements to, current infrastructure practices, including process enhancements, tool upgrades, and cost optimizations.
  • Present technical information to teams and stakeholders, providing guidance and insight on development processes and technologies.

Ideally you’d have:

  • 5+ years of full-time, post-graduation engineering experience, with a specialty in back-end systems.
  • Extensive experience supporting cloud-based infrastructure (AWS preferred).
  • Extensive experience in software development and a deep understanding of distributed systems, cloud platforms, and software development best practices.
  • A track record of leading successful projects of increasing scale and scope.
  • Excellent communication and collaboration skills, and the ability to explain complex technical concepts to non-technical stakeholders.
  • Advanced Linux troubleshooting skills, including diagnostic experience with common logging & telemetry systems, IAM management, and proficiency with TCP/IP and the OSI model.
  • Strong knowledge of software engineering best practices and CI/CD tooling.

Nice to haves:

  • Experience with Azure and GCP, and GPU-based compute.
  • Experience scaling products at hyper-growth startups.
  • Excitement to work with AI technologies.




