Principal Engineer - Databricks

Flexera Software

12+ Years | Bangalore, Karnataka, India (Remote) | Full Time | 3 months ago

Job Summary

Flexera is seeking a Principal Engineer with deep expertise in Databricks to architect and optimize big data platforms. This role involves driving the core design and performance of the lakehouse, enabling analytics, AI/ML, and product innovation at scale. The senior technical leader will own key components of the data platform, collaborating with architects, data scientists, and engineers to deliver high-performance, scalable solutions, focusing on modern data engineering, distributed systems, and cloud-native architectures.

Must Have

Architect, design, and implement scalable and performant Databricks-based data pipelines.
Optimize Databricks Lakehouse architecture for cost efficiency, data reliability, and query performance.
Define and implement advanced data models and ontologies for complex analytics and AI/ML.
Design and tune complex Spark applications (PySpark and/or Scala) for high-volume data processing.
Implement orchestration workflows (Databricks Workflows, Azure Data Factory, or similar).
Drive best practices for data governance, security, and lineage using Unity Catalog.
Optimize cluster configurations, autoscaling strategies, and cost management within Databricks.
Collaborate with data scientists and analytics engineers to enable self-service data access.
Lead complex troubleshooting, performance tuning, and root-cause analysis across distributed systems.
Author and maintain technical artifacts like architecture diagrams and design docs.
Mentor and coach engineers on Databricks internals, Spark performance tuning, and data engineering.
Bachelor’s or higher degree in Computer Science or related field.
12+ years of software engineering experience, with 6+ years in large-scale data platforms.
Expert-level proficiency in Databricks, including Delta Lake, Unity Catalog, and Spark tuning.
Strong expertise with cloud-native data engineering in Azure or AWS.
Knowledge of streaming architectures (Apache Kafka, Structured Streaming).
Hands-on experience with metadata catalogs, orchestration frameworks, and observability tooling.
Strong understanding of distributed systems, fault tolerance, and performance engineering.
Proven ability to solve complex, ambiguous problems and make architectural trade-offs.
Excellent technical communication skills.
Prior experience mentoring engineers and setting technical direction.

Good to Have

Experience with Power BI Direct Lake mode and semantic modeling for lakehouse-based analytics.
Familiarity with Mosaic AI, Databricks GenAI capabilities, or Feature Store implementations.
Contributions to open-source Spark, Databricks community content, or certifications (e.g., Databricks Certified Data Engineer Professional).

Perks & Benefits

Fun and engaged hybrid working environment where collaboration and innovation thrive.
Diversity is valued, and applicants from underrepresented groups in technology are encouraged to apply.
Opportunity to contribute to a world-class global product.
Opportunities for career growth and professional development.
Continuous learning is encouraged.
Equal opportunity employer.
Diverse, equitable, and inclusive workforce.

Job Description

Flexera saves customers billions of dollars in wasted technology spend. A pioneer in Hybrid ITAM and FinOps, Flexera provides award-winning, data-oriented SaaS solutions for technology value optimization (TVO), enabling IT, finance, procurement and cloud teams to gain deep insights into cost optimization, compliance and risks for each business service. Flexera One solutions are built on a set of definitive customer, supplier and industry data, powered by our Technology Intelligence Platform, that enables organizations to visualize their Enterprise Technology Blueprint™ in hybrid environments—from on-premises to SaaS to containers to cloud.

We’re transforming the software industry. We’re Flexera. With more than 50,000 customers across the world, we’re achieving that goal. But we know we can’t do any of that without our team. Ready to help us re-imagine the industry during a time of substantial growth and ambitious plans? Come and see why we’re consistently recognized by Gartner, Forrester and IDC as a category leader in the marketplace. Learn more at flexera.com.

At Flexera, we empower global enterprises by turning IT insights into action. We’re seeking a Principal Engineer with deep expertise in architecting and optimizing big data platforms, with a strong focus on Databricks. You’ll drive the core design and performance of our lakehouse, enabling analytics, AI/ML, and product innovation at scale. As a senior technical leader, you’ll own key components of our data platform — collaborating with architects, data scientists, and engineers to deliver high-performance, scalable solutions. The ideal candidate is a Databricks expert, fluent in modern data engineering, distributed systems, and cloud-native architectures.

What you'll do:

Architect, design, and implement scalable and performant Databricks-based data pipelines, including batch and streaming workloads.
Optimize Databricks Lakehouse architecture — including Delta Lake, Unity Catalog, and Photon — for cost efficiency, data reliability, and query performance at scale.
Define and implement advanced data models and ontologies to support complex analytics and AI/ML workflows.
Design and tune complex Spark applications (PySpark and/or Scala) for high-volume data processing, including structured streaming and incremental processing.
Implement orchestration workflows (e.g., Databricks Workflows, Azure Data Factory, or similar) to automate data pipelines with robust monitoring and alerting.
Drive best practices for data governance, security, and lineage, leveraging Unity Catalog, table ACLs, and audit capabilities.
Optimize cluster configurations, autoscaling strategies, and cost management within Databricks environments.
Collaborate with data scientists and analytics engineers to enable self-service access to high-quality, well-documented datasets and features.
Lead complex troubleshooting efforts, deep performance tuning, and root-cause analysis across distributed systems.
Author and maintain technical artifacts, including architecture diagrams, design docs, reference implementations, and best practice guides.
Mentor and coach engineers across teams on Databricks internals, Spark performance tuning, and modern data engineering architectures.
Stay on top of emerging Databricks features (e.g., Mosaic AI, serverless compute, Delta Live Tables), advocating for their adoption when they bring tangible value.

You'll be expected to have:

Bachelor’s or higher degree in Computer Science, Software Engineering, or a related field.
12+ years of software engineering experience, with 6+ years specifically in building large-scale data platforms.
Expert-level proficiency in Databricks, including:
    Delta Lake internals (ACID transactions, time travel, Z-ordering, data compaction strategies)
    Unity Catalog for data governance, security, and lineage
    Advanced Spark tuning (shuffle management, partitioning strategies, caching, cluster configuration)
    Databricks SQL Warehousing, Workflows, and MLflow integration
Strong expertise with cloud-native data engineering in Azure or AWS, including storage (ADLS/S3), networking, and IAM integration.
Knowledge of streaming architectures (Apache Kafka, structured streaming), schema evolution, and event-driven data modeling.
Hands-on experience with metadata catalogs, orchestration frameworks, and observability tooling (e.g., Datadog, Grafana).
Strong understanding of distributed systems, fault tolerance, and large-scale performance engineering.
Proven ability to solve complex, ambiguous problems and make architectural trade-offs in real-world production systems.
Excellent technical communication skills — ability to document, present, and explain technical concepts clearly to varied audiences.
Prior experience mentoring engineers, conducting design reviews, and setting technical direction across teams.
A continuous learner mindset — actively keeping pace with evolving Databricks platform capabilities and broader data engineering trends.

Preferred but not required:

Experience with Power BI Direct Lake mode and semantic modeling for lakehouse-based analytics.
Familiarity with Mosaic AI, Databricks GenAI capabilities, or Feature Store implementations.
Contributions to open-source Spark, Databricks community content, or certifications (e.g., Databricks Certified Data Engineer Professional).

At Flexera, we foster a fun and engaged hybrid working environment where collaboration and innovation thrive. We value diversity and encourage applicants from underrepresented groups in technology to apply.

Join our team to not only contribute to a world-class global product but also to grow in your career. At Flexera, we encourage continuous learning and provide opportunities for professional development.

Flexera is proud to be an equal opportunity employer. Qualified applicants will be considered for open roles regardless of age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by local/national laws, policies and/or regulations.

Flexera understands the value that results from employing a diverse, equitable, and inclusive workforce. We recognize that equity necessitates acknowledging past exclusion and that inclusion requires intentional effort. Our DEI (Diversity, Equity, and Inclusion) council is the driving force behind our commitment to championing policies and practices that foster a welcoming environment for all.

We encourage candidates requiring accommodations to please let us know by emailing careers@flexera.com.

17 Skills Required For This Role

SaaS Business Models, Communication, Problem Solving, Data Analytics, Cost Management, Unity, Game Texts, Networking, AWS, Azure, Apache Kafka, Grafana, Power BI, Spark, Scala, SQL, Photon
