Lead Data Engineer (PySpark)

Tide

Telangana, India (Remote)

8+ Years • Data Analysis

Job Summary

Job Description

We are seeking a highly skilled and experienced Senior Data Engineer with deep expertise in PySpark to join our ML/Data engineering team. This team is responsible for feature development, data quality checks, and deploying and integrating ML models with backend services and the overall Tide platform. In this role, you will be instrumental in designing, developing, and optimizing our next-generation data pipelines and data platforms. You will work with large-scale datasets, solve complex data challenges, and contribute to building robust, scalable, and efficient data solutions that drive business value. This is an exciting opportunity for someone passionate about big data technologies, performance optimization, and building resilient data infrastructure.

Must have:

Identify, diagnose, and resolve complex performance bottlenecks in PySpark jobs and Spark clusters, leveraging Spark UI, query plans, and advanced optimization techniques.
Lead the design and implementation of highly scalable, fault-tolerant, and optimized ETL/ELT pipelines using PySpark for batch and potentially real-time data processing.
Collaborate with data scientists, analysts, and product teams to understand data requirements and design efficient data models for analytical and operational use cases.
Implement robust data quality checks, monitoring, and alerting mechanisms to ensure the accuracy, consistency, and reliability of our data assets.
Contribute to the overall data architecture strategy, evaluating new technologies and best practices to enhance our data platform's capabilities and efficiency.
Promote and enforce engineering best practices, including code quality, testing, documentation, and version control (Git). Participate actively in code reviews.
Mentor junior data engineers, share knowledge, and contribute to a culture of continuous learning and improvement within the team.
Work closely with cross-functional teams including software engineers, data scientists, product managers, and business stakeholders to deliver impactful data solutions.

Perks:

Competitive salary
Self & Family Health Insurance
Term & Life Insurance
OPD Benefits
Mental wellbeing through Plumm
Learning & Development Budget
WFH Setup allowance
15 days of Privilege leaves
12 days of Casual leaves
12 days of Sick leaves
3 paid days off for volunteering or L&D activities
Stock Options

11 skills required

cross-functional

communication

problem-solving

data-analytics

budget-management

github

game-texts

spark

git

python

sql

Job Details

ABOUT THE ROLE

In this role, you will be instrumental in designing, developing, and optimizing our next-generation data pipelines and data platforms. You will work with large-scale datasets, solve complex data challenges, and contribute to building robust, scalable, and efficient data solutions that drive business value.

This is an exciting opportunity for someone passionate about big data technologies, performance optimization, and building resilient data infrastructure.

As a Data Engineer, you’ll be responsible for:

Performance Optimization: Identify, diagnose, and resolve complex performance bottlenecks in PySpark jobs and Spark clusters, leveraging Spark UI, query plans, and advanced optimization techniques (e.g., partitioning, caching, broadcasting, AQE, UDF optimization).
Design & Development: Lead the design and implementation of highly scalable, fault-tolerant, and optimized ETL/ELT pipelines using PySpark for batch and potentially real-time data processing.
Data Modeling: Collaborate with data scientists, analysts, and product teams to understand data requirements and design efficient data models (e.g., star/snowflake schemas, SCDs) for analytical and operational use cases.
Data Quality & Governance: Implement robust data quality checks, monitoring, and alerting mechanisms to ensure the accuracy, consistency, and reliability of our data assets.
Architectural Contributions: Contribute to the overall data architecture strategy, evaluating new technologies and best practices to enhance our data platform's capabilities and efficiency.
Code Review & Best Practices: Promote and enforce engineering best practices, including code quality, testing, documentation, and version control (Git). Participate actively in code reviews.
Mentorship & Leadership: Mentor junior data engineers, share knowledge, and contribute to a culture of continuous learning and improvement within the team.
Collaboration: Work closely with cross-functional teams including software engineers, data scientists, product managers, and business stakeholders to deliver impactful data solutions.

WHAT ARE WE LOOKING FOR

8+ years of professional experience in data engineering, with at least 4 years specifically focused on PySpark development and optimization in a production environment.
Expert-level proficiency in PySpark including Spark SQL, DataFrames, RDDs, and understanding of Spark's architecture (Driver, Executors, Cluster Manager, DAG).
Strong hands-on experience with optimizing PySpark performance on large datasets, debugging slow jobs using Spark UI, and addressing common issues like data skew, shuffles, and memory management.
Excellent programming skills in Python with a focus on writing clean, efficient, and maintainable code.
Proficiency in SQL for complex data manipulation, aggregation, and querying.
Basic understanding of data warehousing concepts (dimensional modeling, ETL/ELT processes, data lakes, data marts).
Experience with distributed data storage solutions such as Delta Lake and Apache Parquet.
Familiarity with version control systems (Git).
Strong problem-solving abilities, analytical skills, and attention to detail.
Excellent communication and interpersonal skills, with the ability to explain complex technical concepts to both technical and non-technical audiences.
Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field.

WHAT YOU WILL GET IN RETURN

Make work, work for you! We are embracing new ways of working and support flexible working arrangements. With our Working Out of Office (WOO) policy, our colleagues can work remotely from home or anywhere in their assigned Indian state. Additionally, you can work from a different country or Indian state for 90 days of the year. Plus, you’ll get:

Competitive salary
Self & Family Health Insurance
Term & Life Insurance
OPD Benefits
Mental wellbeing through Plumm
Learning & Development Budget
WFH Setup allowance
15 days of Privilege leaves
12 days of Casual leaves
12 days of Sick leaves
3 paid days off for volunteering or L&D activities
Stock Options

About The Company

Tide

Tide is the leading provider of business accounts for UK small and medium-sized enterprises (SMEs) and one of the fastest-growing fintechs in the UK. Tide is live in the UK and India, with over 650,000 members in the UK and more than 350,000 in India.

Tide is transforming the small business banking market. Our platform not only offers business accounts and related banking services, but also a comprehensive set of highly connected admin tools for businesses, such as full integration with accounting systems (live for our UK members, many are live in India, and coming soon to Germany). Using advanced technology, all solutions are designed with SMEs in mind.

With quick onboarding, low fees and innovative features, we thrive on making data-driven decisions to help SMEs save both time and money.
