Senior Data Engineer

8 Minutes ago • All levels • Data Analysis

Job Summary

Job Description

Toppan Merrill is seeking a Senior Data Engineer to develop and optimize ETL/ELT data pipelines using Azure Databricks and Apache Spark (PySpark). The role involves implementing Unity Catalog for data governance, integrating with cloud data platforms like Azure Data Lake Storage, and automating workflows with CI/CD pipelines. The engineer will collaborate with data analysts, scientists, and business teams to translate requirements into scalable data engineering solutions, ensuring data quality and secure data sharing.
Must have:
  • Develop, test, and maintain ETL/ELT data pipelines using Azure Databricks & Apache Spark (PySpark).
  • Optimize performance and cost-efficiency of Spark jobs.
  • Ensure data quality through validation, monitoring, and alerting mechanisms.
  • Design and enforce access control policies using Unity Catalog.
  • Manage data lineage, auditing, and metadata governance.
  • Work with Azure Data Lake Storage / Azure Blob Storage / Azure Event Hub.
  • Implement Delta Lake for scalable, ACID-compliant storage.
  • Develop CI/CD pipelines for data workflows using Azure Databricks Workflows or Azure Data Factory.
  • Monitor and troubleshoot failures in job execution and cluster performance.
  • Proficiency in writing optimized and maintainable Python code.
  • Strong knowledge of SQL for data transformations and optimizations.
  • Familiarity with Databricks CLI, Databricks DABs, and DevOps principles.
  • Knowledge of IAM, role-based access control (RBAC), and encryption.
  • Ability to pull data from a wide variety of APIs.
Good to have:
  • Experience with MLflow for model tracking & deployment in Databricks.
  • Familiarity with streaming technologies (Kafka, Delta Live Tables, Azure Event Hub, Azure Event Grid).
  • Hands-on experience with dbt (Data Build Tool) for modular ETL development.
  • Certification in Databricks or Azure.
  • Experience with Azure Databricks Lakehouse connectors for Salesforce and SQL Server.
  • Experience with Azure Synapse Link for Dynamics and Dataverse.
  • Familiarity with other data pipeline strategies, such as Azure Functions, Fabric, and ADF.

Job Details

Job Description:

Responsibilities:

  • Develop & Optimize Data Pipelines
      • Build, test, and maintain ETL/ELT data pipelines using Azure Databricks & Apache Spark (PySpark).
      • Optimize performance and cost-efficiency of Spark jobs.
      • Ensure data quality through validation, monitoring, and alerting mechanisms.
      • Understand cluster types, their configuration, and the use cases for serverless compute.
  • Implement Unity Catalog for Data Governance
      • Design and enforce access control policies using Unity Catalog.
      • Manage data lineage, auditing, and metadata governance.
      • Enable secure data sharing across teams and external stakeholders.
  • Integrate with Cloud Data Platforms
      • Work with Azure Data Lake Storage / Azure Blob Storage / Azure Event Hub to integrate Databricks with cloud-based data lakes, data warehouses, and event streams.
      • Implement Delta Lake for scalable, ACID-compliant storage.
  • Automate & Orchestrate Workflows
      • Develop CI/CD pipelines for data workflows using Azure Databricks Workflows or Azure Data Factory.
      • Monitor and troubleshoot failures in job execution and cluster performance.
  • Collaborate with Stakeholders
      • Work with Data Analysts, Scientists, and Business Teams to understand requirements.
      • Translate business needs into scalable data engineering solutions.
  • API Expertise
      • Ability to pull data from a wide variety of APIs using different strategies and methods.

Required Skills & Experience:

  • Azure Databricks & Apache Spark (PySpark) – Strong experience in building distributed data pipelines.
  • Python – Proficiency in writing optimized and maintainable Python code for data engineering.
  • Unity Catalog – Hands-on experience implementing data governance, access controls, and lineage tracking.
  • SQL – Strong knowledge of SQL for data transformations and optimizations.
  • Delta Lake – Understanding of time travel, schema evolution, and performance tuning.
  • Workflow Orchestration – Experience with Azure Databricks Jobs or Azure Data Factory.
  • CI/CD & Infrastructure as Code (IaC) – Familiarity with Databricks CLI, Databricks DABs, and DevOps principles.
  • Security & Compliance – Knowledge of IAM, role-based access control (RBAC), and encryption.

Preferred Qualifications:

  • Experience with MLflow for model tracking & deployment in Databricks.
  • Familiarity with streaming technologies (Kafka, Delta Live Tables, Azure Event Hub, Azure Event Grid).
  • Hands-on experience with dbt (Data Build Tool) for modular ETL development.
  • Certification in Databricks or Azure is a plus.
  • Experience with Azure Databricks Lakehouse connectors for Salesforce and SQL Server.
  • Experience with Azure Synapse Link for Dynamics and Dataverse.
  • Familiarity with other data pipeline strategies, such as Azure Functions, Fabric, and ADF.

Soft Skills:

  • Strong problem-solving and debugging skills.
  • Ability to work independently and in teams.
  • Excellent communication and documentation skills.


About The Company

Toppan Merrill is the best-in-class partner for complex, secure communications that delivers premier technology-driven solutions to more efficiently and accurately communicate mission-critical content. We are: Trusted, Responsive, Expert and Human. Toppan Merrill is built on what today's clients demand and tomorrow's clients require: a responsive partnership, rooted in deep market expertise, modern agile solutions built around your business needs, and a commitment to forward-thinking technology that ensures speed, precision, and accuracy.
