Senior Data Engineer

2 Months ago • 5 Years + • Data Analyst

Job Summary

Job Description

As a Senior Data Engineer at Springer Nature, you'll design, develop, and maintain scalable data pipelines on GCP using BigQuery, Python, DBT, Terraform, and GIT. You'll collaborate with data scientists and stakeholders, ensuring data quality and integrity. Responsibilities include implementing CI/CD pipelines, data modeling with DBT, and monitoring pipeline reliability. You'll work within an agile team, contributing to daily stand-ups and other ceremonies. The role involves interacting with various stakeholders and requires strong SQL, data problem-solving, and communication skills. Experience with large-scale data processing and GCP services is highly valued.
Must have:
  • 5+ years GCP Data Engineering experience
  • Proficient in Python, SQL, DBT
  • Experience with BigQuery, Terraform, GIT
  • Data pipeline design & maintenance
  • Data modeling & transformation
Good to have:
  • GCP Certifications
  • Experience with other GCP services
  • Knowledge of data governance
  • Machine learning experience
Perks:
  • Hybrid remote/office work
  • 10% time for personal projects
  • Lunch & learn sessions

Job Details

About the job

About Springer Nature Group

Springer Nature opens the doors to discovery for researchers, educators, clinicians and other professionals. Every day, around the globe, our imprints, books, journals, platforms and technology solutions reach millions of people. For over 180 years our brands and imprints have been a trusted source of knowledge to these communities and today, more than ever, we see it as our responsibility to ensure that fundamental knowledge can be found, verified, understood and used by our communities – enabling them to improve outcomes, make progress, and benefit the generations that follow. Visit group.springernature.com and follow @SpringerNature / @SpringerNatureGroup

Job Title: Senior Data Engineer

Location: Pune

Springer Nature is one of the world’s leading global research, educational and professional publishers. It is home to an array of respected and trusted brands and imprints, with more than 170 years of combined history behind them, providing quality content through a range of innovative products and services. Every day, around the globe, our imprints, books, journals and resources reach millions of people, helping researchers and scientists to discover, students to learn and professionals to achieve their goals and ambitions. The company has almost 13,000 staff in over 50 countries.

About Us

We’re looking for a Data Engineer to join Content Acquisition within Springer Nature Technology. Springer Nature is a leading publisher of scientific books, journals and magazines with over 3,000 journal titles and one of the world’s largest corpora of peer-reviewed scientific text data. You would be joining the team responsible for evolving a cross-platform view (Data-as-a-Product) of our submission data. This view drives workflow management, customer experience and business reporting in up to 30 teams.

We are committed to growing and nurturing our people for the long term. We spend 10% of our time working on our own projects to promote learning and innovation, and we hold regular lunch n’ learn sessions to share knowledge.

We offer mixed remote/office working, with up to two or three days per week working from home. You'll be part of our bigger community of developers located in India, Portugal, Germany and the UK.

You will be joining a cross functional team with different nationalities, backgrounds and experience levels. All team members collaborate to deliver solutions that best satisfy the needs of researchers and other readers.

Roles & Responsibilities

We are looking for an experienced GCP Data Engineer with over 5 years of hands-on experience. The ideal candidate will work with Google Cloud Platform (GCP), BigQuery, Python, DBT, Terraform, and GIT.

  • Design, develop, and maintain scalable data pipelines on GCP.
  • Implement and optimize data storage solutions with BigQuery for large-scale processing.
  • Develop, test, and deploy data transformation workflows using Python.
  • Collaborate with data scientists, analysts, and stakeholders to meet data requirements.
  • Ensure data quality and integrity throughout the data lifecycle.
  • Implement CI/CD pipelines for data applications and manage infrastructure using Terraform.
  • Utilize DBT for data modelling, testing, and documentation.
  • Use GIT for version control and code collaboration.
  • Monitor and troubleshoot data pipelines to ensure reliability and performance.
  • Stay updated with industry trends and best practices in data engineering and GCP services.


Within 3 Months You Will

  • Get familiar with our emerging technology stack and data landscape.
  • Align yourself with the work of the data platform team and understand the data requirements and issues facing our users.
  • Collaborate effectively with each discipline on the team.
  • Actively participate in technical discussions and share ideas.
  • Work with architects and other data engineers in the organization to align the data processing architecture.


By 3-6 Months You Will

  • Have an understanding of the team’s context within the wider organization.
  • Be a supportive member of the team, developing the platform by using the appropriate technology solutions to solve the problem at hand.
  • Triage support queries and diagnose issues in our live applications.
  • Identify new sources of data across the organization and build relationships with data providers to gain access.
  • Understand the processes by which data is acquired and any resulting limitations or bias and communicate this to the team.
  • Develop and maintain data pipelines to load data into systems like BigQuery, to analyze, clean and join datasets, in an automated, repeatable way.
  • Ensure that data is stored securely and in compliance with GDPR.
  • Work with data owners to understand how we can allow them to self-serve their data using tools we develop.


By 6-12 Months You Will

  • Develop processes and tools to monitor feeds and test data integrity and completeness and to alert users when a problem occurs.
  • Understand our customers’ needs, both internal and external, and how your work affects their experience.
  • Be able to gauge the complexity or scope of a piece of work, breaking it into smaller pieces when appropriate.
  • Give and receive constructive feedback within your team.
  • Mentor other members of the team in the principles of data engineering and promote best practice.
  • Promote and advocate the use of data across Springer Nature.
  • If you have an interest in data science, there may be opportunities to apply machine learning techniques to these datasets to assist in the work of domain teams.


Day To Day Responsibilities

As part of an Agile product team, day-to-day you will:

  • Take part in our daily stand-ups.
  • Contribute to ceremonies like steering, story writing, collaborative design and retrospectives.
  • Develop new features and improve code quality by pair programming with other team members.
  • Take part in the support and monitoring of our services.
  • Interact with various stakeholders where required to deliver quality products.


About You

  • Over 5 years of experience in data/software engineering on a cloud platform (AWS/GCP/Azure) using tools such as DBT and programming languages such as Python, Scala or Java.
  • You have strong SQL and data problem-solving skills.
  • You have experience with data modelling and transformation tools like DBT.
  • You possess a solid understanding of modern data engineering practices.
  • You factor in non-functional aspects of data pipeline development, including quality checks, cost-effectiveness, sensitive data handling, usage monitoring, and observability of data pipelines and data quality.
  • You promote working in a cross-functional, collaborative team where there is collective code ownership.
  • You understand how your team’s work can impact interdependent teams and design accordingly.
  • You are comfortable making large-scale refactors of a codebase.
  • You can facilitate and guide technical discussions to a workable outcome.
  • You enjoy mentoring team members and act as a role model on the team.
  • You understand distributed systems concepts and are familiar with the pros and cons of common data architectures, including data meshes.


Good To Have

  • Expertise in GCP & BigQuery and large-scale data processing.
  • Strong Python programming skills.
  • Experience with data modelling and transformation tools like DBT.
  • Familiarity with infrastructure-as-code tools like Terraform.
  • Proficiency with GIT for version control.
  • Strong problem-solving skills and attention to detail.
  • Excellent communication and teamwork abilities.
  • GCP Certification (e.g., Professional Data Engineer).
  • Experience with other GCP services (e.g., Cloud Storage, Cloud Composer, Dataflow).
  • Knowledge of data governance and security best practices.


At Springer Nature, we value and celebrate the diversity of our people. We recognize the many benefits of a diverse workforce and strive for an inclusive workplace that empowers all our colleagues to thrive. Our search for the best talent fully encompasses and embraces these values and principles.

At Springer Nature, we value the diversity of our teams and work to build an inclusive culture, where people are treated fairly and can bring their differences to work and thrive. We empower our colleagues and value their diverse perspectives as we strive to attract, nurture and develop the very best talent. Springer Nature was awarded Diversity Team of the Year at the 2022 British Diversity Awards. Find out more about our DEI work at https://group.springernature.com/gp/group/taking-responsibility/diversity-equity-inclusion

If you have any access needs related to disability, neurodivergence or a chronic condition, please contact us so we can make all necessary accommodations.

For more information about career opportunities at Springer Nature, please visit https://careers.springernature.com/
