Senior Software Engineer - Distributed Data Systems

5+ Years • Data Analysis • $166,000 PA - $225,000 PA

Job Summary

Job Description

Databricks enables data teams to solve complex problems by providing the world's best data and AI infrastructure platform. As a Senior Software Engineer on the Runtime team, you will build next-generation distributed data storage and processing systems. These systems will support diverse workloads from ETL to data science, outperforming traditional SQL engines while offering advanced expressiveness and programming abstractions. The role focuses on scaling services and infrastructure, contributing to projects like Apache Spark, Delta Lake, and performance engineering.
Must have:
  • BS (or higher) in Computer Science or a related technical field, or equivalent practical experience.
  • Comfortable working towards a multi-year vision with incremental deliverables.
  • Motivated by delivering customer value and impact.
  • 5+ years of production-level experience in Java, Scala, or C++.
  • Strong foundation in algorithms and data structures and their real-world use cases.
  • Experience with distributed systems, databases, and big data systems (Apache Spark, Hadoop).

Job Details

At Databricks, we are passionate about enabling data teams to solve the world's toughest problems — from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. We do this by building and running the world's best data and AI infrastructure platform so our customers can use deep data insights to improve their business. Founded by engineers — and customer obsessed — we leap at every opportunity to solve technical challenges, from designing next-gen UI/UX for interfacing with data to scaling our services and infrastructure across millions of virtual machines. And we're only getting started.

Modern data analysis employs sophisticated methods, such as machine learning, that go well beyond the roll-up and drill-down capabilities of traditional SQL query engines. As a software engineer on the Runtime team at Databricks, you will build next-generation distributed data storage and processing systems that outperform specialized SQL query engines in relational query performance, yet provide the expressiveness and programming abstractions to support diverse workloads ranging from ETL to data science.

Below are some example projects:

Apache Spark™

Develop the de facto open-source standard framework for big data.

Data Plane Storage

Provide reliable, high-performance services and client libraries for storing and accessing very large amounts of data on cloud storage backends, e.g., AWS S3 and Azure Blob Storage.

Delta Lake

A storage management system that combines the scale and cost-efficiency of data lakes, the performance and reliability of a data warehouse, and the low latency of streaming. Its higher-level abstractions and guarantees, including ACID transactions and time travel, drastically reduce the complexity of real-world data engineering architectures.

Delta Pipelines

Managing even a single data engineering pipeline is difficult. The goal of the Delta Pipelines project is to make it simple to orchestrate and operate tens of thousands of data pipelines. It provides a higher-level abstraction for expressing data pipelines, lets customers deploy, test, and upgrade them, and eliminates the operational burden of building and managing high-quality data pipelines.

Performance Engineering

Build the next-generation query optimizer and execution engine that is fast, tuning-free, scalable, and robust.

What we look for:

  • BS (or higher) in Computer Science or a related technical field, or equivalent practical experience.
  • Comfortable working towards a multi-year vision with incremental deliverables.
  • Motivated by delivering customer value and impact.
  • 5+ years of production-level experience in Java, Scala, or C++.
  • Strong foundation in algorithms and data structures and their real-world use cases.
  • Experience with distributed systems, databases, and big data systems (Apache Spark, Hadoop).

About The Company

Mountain View, California, United States (On-Site)

Bellevue, Washington, United States (On-Site)

Berlin, Berlin, Germany (On-Site)

Dallas, Texas, United States (On-Site)

Aarhus, Denmark (On-Site)

Bengaluru, Karnataka, India (On-Site)

San Francisco, California, United States (On-Site)
