Senior Research Engineer (Data)

3 Months ago • 3+ Years • Artificial Intelligence • Data Analyst • $175,000 PA - $250,000 PA

Job Summary

Job Description

This Senior Research Engineer (Data) role focuses on spearheading data acquisition and management systems for advanced AI research. Responsibilities include architecting and maintaining efficient data pipelines for sourcing, processing, and organizing large datasets used in generative AI models. The role requires partnering with research teams to improve model performance by identifying and leveraging novel data sources, developing robust data pipelines (including deduplication and filtering), collaborating with annotation teams to enhance dataset quality, applying advanced methodologies like self-supervised active learning, and leading research projects to improve data quality for video generation models. The ideal candidate will have 3+ years of experience managing large-scale datasets in fields like computer vision or NLP, strong Python and PyTorch skills, and experience with large-scale data processing tools like SQL or Spark.
Must have:
  • 3+ years experience managing large datasets
  • Strong Python & PyTorch proficiency
  • Experience with SQL or Spark
  • Expertise in designing distributed systems
  • Data pipeline development & maintenance
Perks:
  • Competitive equity packages
  • Comprehensive benefits plan

Job Details

We are seeking a Senior Research Engineer (Data) to spearhead our data acquisition and management systems, which are critical to our advanced AI research. In this role, you will architect and maintain efficient pipelines for sourcing, processing, and organizing the extensive datasets that fuel our generative AI models. Your expertise will directly shape the quality and capabilities of our technology.

Responsibilities

  • Partner with research teams to understand and address model performance gaps by identifying and leveraging novel data sources.
  • Develop and implement robust data pipelines for acquisition, deduplication, filtering, and pre-training dataset preparation.
  • Collaborate with annotation operations teams to design innovative data filtering strategies and enhance dataset quality.
  • Apply and integrate advanced methodologies such as self-supervised active learning to scale data systems.
  • Lead research projects to improve data quality and drive advancements in video generation models.

Qualifications

  • Education: Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field.
  • Experience: 3+ years of experience in managing and curating large-scale datasets, particularly in fields like computer vision, NLP, robotics, or self-driving technologies.

Key Skills:

  • Strong proficiency in Python and familiarity with deep learning frameworks such as PyTorch.
  • Experience with large-scale data processing tools, such as SQL or Spark.
  • Hands-on expertise in designing and working with distributed systems.
  • Proven ability to thrive in a fast-paced, research-focused environment and deliver end-to-end project solutions.

Note: This position is not intended for recent graduates.

Compensation

The salary range for this role in California is $175,000–$250,000 per year. Actual base pay may vary based on factors such as job-related expertise, skills, experience, and candidate location. Additionally, we provide competitive equity packages through stock options and a comprehensive benefits plan.

About The Company

An idea-to-video platform that brings your creativity to motion.

Palo Alto, California, United States (On-Site)

Get notified when new jobs are added by Pika
