Data Engineer

1 Month ago • All levels • Data Analysis

Job Summary

Job Description

Krea is building next-generation AI creative tools, focusing on making AI intuitive and controllable to empower human creativity across various formats like text, images, video, sound, and 3D. This Data Engineer role is fundamental to Krea, involving the processing of petabytes of data for AI training, analytics, and core systems. The role includes building distributed systems, working with the research team on ML pipelines, and managing massive compute on Kubernetes GPU clusters.
Must have:
  • Build distributed systems to process gigantic amounts of files (images, video, 3D data).
  • Solve scaling problems related to data processing.
  • Work closely with the research team to build ML pipelines and deploy models.
  • Play with massive amounts of compute on huge Kubernetes GPU clusters.
Good to have:
  • Python
  • PyArrow
  • DuckDB
  • SQL
  • Massive relational databases
  • PyTorch
  • Pandas
  • NumPy
  • Kubernetes
  • Designing and implementing large-scale ETL systems
  • Fundamental knowledge of containerization, operating systems, file-systems, and networking
  • Distributed systems design

Job Details

About Krea

At Krea, we are building next-generation AI creative tools.

We are dedicated to making AI intuitive and controllable for creatives. Our mission is to build tools that empower human creativity, not replace it.

We believe AI is a new medium that allows us to express ourselves through various formats—text, images, video, sound, and even 3D. We're building better, smarter, and more controllable tools to harness this medium.

This job

Data is one of the fundamental pieces of Krea. Huge amounts of data power our AI training pipelines, our analytics and observability, and many of the core systems that make Krea tick.

As a data engineer, you will…

  • … build distributed systems to process gigantic amounts (petabytes) of files of all kinds (images, video, and even 3D data). You should feel comfortable solving scaling problems as you go.
  • … work closely with our research team to build ML pipelines and deploy models to make sense of raw data.
  • … play with massive amounts of compute on huge Kubernetes GPU clusters; our main GPU cluster takes up an entire datacenter from our provider.
  • … learn machine learning engineering (ML experience is a bonus, but you can also learn it on the job) from world-class researchers on a small yet highly effective tight-knit team.

Example projects

  • Find clean scenes in millions of videos, running distributed data pipelines that detect shot boundaries and saving timestamps of clips.
  • Solve orchestration and scaling issues with a large-scale distributed GPU job processing system on Kubernetes.
  • Build systems to deploy and combine different LLMs to caption massive amounts of multimedia data in a variety of ways.
  • Design multi-stage pipelines to turn petabytes of raw data into clean downstream datasets, with metadata, annotations, and filters.

Strong candidates may have experience with…

  • Python, PyArrow, DuckDB, SQL, massive relational databases, PyTorch, Pandas, NumPy…
  • Kubernetes
  • Designing and implementing large-scale ETL systems
  • Fundamental knowledge of containerization, operating systems, file-systems, and networking.
  • Distributed systems design

About us

  • We’re building AI creative tooling.
  • We’ve raised over $83M from the best investors in Silicon Valley.
  • We’re a team of 12 with millions of active users, and we’re scaling aggressively.

About The Company

San Francisco, California, United States (On-Site)
