At Databricks, we are obsessed with enabling data teams to solve the world's toughest problems. We do this by building and running the world's best data and AI infrastructure platform, so our customers can focus on the high-value challenges that are central to their own missions.
Founded in 2013 by the original creators of Apache Spark, Databricks has grown from a tiny corner office in Berkeley, California to a global organization with over 1,000 employees. Thousands of organizations, from small to Fortune 100, trust Databricks with their mission-critical workloads, making us one of the fastest-growing SaaS companies in the world.
Our engineering teams build highly technical products that fulfill real, important needs in the world. We constantly push the boundaries of data and AI technology, while simultaneously operating with the resilience, security, and scale that are critical to making customers successful on our platform.
We develop and operate one of the largest-scale software platforms. The fleet consists of millions of virtual machines, generating terabytes of logs and processing exabytes of data per day. At our scale, we regularly observe cloud hardware, network, and operating system faults, and our software must gracefully shield our customers from all of them.
As a Data Scientist on the Data Team, you will help build a data-driven culture within Databricks by working on the company's top priorities. The Data Team also functions as an in-house, production "customer" that dogfoods Databricks and drives the future direction of the products.
The impact you will have:
- “Analysis at the speed of thought”: Inform decision making by building robust data science tooling for business leaders, analysts, and other data scientists.
- “Extend capabilities of Databricks”: Work closely with Data Platform and Product Engineering teams to integrate data science tooling with existing Data Team offerings and the core product.
- “Strategic business insights”: Lead insight generation for top company priorities and key Engineering initiatives (reliability and efficiency).
- Gather changing requirements, define project OKRs and milestones, and communicate progress and results to both technical and non-technical audiences.
- Mentor and guide junior data scientists on the team by helping with project planning, technical decisions, and code and document review.
- Represent the data science discipline throughout the organization, having a powerful voice to make us more data-driven.
- Represent Databricks at academic and industrial conferences & events.
What we look for:
- 7+ years of data science, machine learning, advanced analytics experience in high-velocity, high-growth companies.
- Extensive experience in applying Data Science / ML in production to build data-driven products for solving business problems.
- Experience collaborating with and understanding the needs of senior-level stakeholders from a variety of functions, including Engineering, Product, and Technical Operations.
- Ability to deal with ambiguity in fast-paced environments by clarifying requirements and having a keen sense of 0-to-1 solutions.
- Adept both at operating as an individual contributor and at identifying how to orchestrate the build through peers and investments in scalable tooling.
- Strong coding skills in Python and SQL.
- Experience with distributed data processing systems like Spark, and familiarity with software engineering principles around testing, code reviews, and deployment.
- M.S. or Ph.D. in quantitative fields (e.g., Statistics, Math, Computer Science, Physics, Economics, Operational Research or Engineering)