Sr Big Data Engineer - Oozie and Pig (GCP)

3 Months ago • All levels • Data Analysis • $116,100 PA - $198,440 PA

Job Summary

Job Description

This role requires a Senior Big Data Engineer with expertise in distributed systems, batch data processing, and large-scale data pipelines. Responsibilities include designing and developing scalable batch processing systems using Hadoop, Oozie, Pig, Hive, MapReduce, and HBase; writing efficient, production-ready code; developing and optimizing complex data workflows within the Apache Hadoop ecosystem; leveraging GCP tools (Dataproc, GCS, Composer); implementing DevOps and automation best practices (CI/CD, IaC); and collaborating with cross-functional teams. The ideal candidate possesses strong hands-on experience with Oozie, Pig, the Apache Hadoop ecosystem, and programming proficiency in Java or Python. A deep understanding of data structures and algorithms is essential. This is a fully remote position.
Must have:
  • Oozie, Pig, Hadoop expertise
  • Java/Python proficiency
  • Data structure & algorithm knowledge
  • GCP experience
  • DevOps & CI/CD skills
  • Batch processing system design
  • Scalable data pipeline development
Good to have:
  • Airflow
  • BigTable
  • Redis
  • Spark

Job Details

About the Role 

We are seeking a Senior Big Data Engineer with deep expertise in distributed systems, batch data processing, and large-scale data pipelines. The ideal candidate has strong hands-on experience with Oozie, Pig, the Apache Hadoop ecosystem, and programming proficiency in Java (preferred) or Python. This role requires a deep understanding of data structures and algorithms, along with a proven track record of writing production-grade code and building robust data workflows. 

This is a fully remote position and requires an independent, self-driven engineer who thrives in complex technical environments and communicates effectively across teams. 

Work Location: US-Remote, Canada-Remote 

Key Responsibilities:

    • Design and develop scalable batch processing systems using technologies like Hadoop, Oozie, Pig, Hive, MapReduce, and HBase, with hands-on coding in Java or Python. 
    • Write clean, efficient, and production-ready code with a strong focus on data structures and algorithmic problem-solving applied to real-world data engineering tasks. 
    • Develop, manage, and optimize complex data workflows within the Apache Hadoop ecosystem, with a strong focus on Oozie orchestration and job scheduling. 
    • Leverage Google Cloud Platform (GCP) tools such as Dataproc, GCS, and Composer to build scalable and cloud-native big data solutions. 
    • Implement DevOps and automation best practices, including CI/CD pipelines, infrastructure as code (IaC), and performance tuning across distributed systems. 
    • Collaborate with cross-functional teams to ensure data pipeline reliability, code quality, and operational excellence in a remote-first environment. 

Qualifications:

    • Bachelor's degree in Computer Science, Software Engineering, or a related field of study.
    • Experience with managed cloud services and understanding of cloud-based batch processing systems are critical.
    • Proficiency in Oozie, Airflow, MapReduce, and Java.
    • Strong programming skills with Java (specifically Spark), Python, Pig, and SQL.
    • Expertise in public cloud services, particularly in GCP.
    • Proficiency in the Apache Hadoop ecosystem, including Oozie, Pig, Hive, and MapReduce.
    • Familiarity with BigTable and Redis.
    • Experience applying infrastructure and DevOps principles in daily work, using continuous integration and continuous deployment (CI/CD) tools and Infrastructure as Code (IaC) tools such as Terraform to automate and improve development and release processes.
    • Proven experience in engineering batch processing systems at scale.

The following information is required by pay transparency legislation in the following states: CA, CO, HI, NY, and WA. This information applies only to individuals working in these states.
 
    • The anticipated starting pay range for Colorado is: $116,100 - $170,280.
    • The anticipated starting pay range for the states of Hawaii and New York (not including NYC) is: $123,600 - $181,280.
    • The anticipated starting pay range for California, New York City and Washington is: $135,300 - $198,440.

Unless already included in the posted pay range and based on eligibility, the role may include variable compensation in the form of bonus, commissions, or other discretionary payments. These discretionary payments are based on company and/or individual performance and may change at any time. Actual compensation is influenced by a wide array of factors including but not limited to skill set, level of experience, licenses and certifications, and specific work location. Information on benefits  offered is here.




About Rackspace Technology
We are the multicloud solutions experts. We combine our expertise with the world’s leading technologies across applications, data and security to deliver end-to-end solutions. We have a proven record of advising customers based on their business challenges, designing solutions that scale, building and managing those solutions, and optimizing returns into the future. Named a best place to work year after year by Fortune, Forbes and Glassdoor, we attract and develop world-class talent. Join us on our mission to embrace technology, empower customers and deliver the future.
 
 
More on Rackspace Technology
Though we’re all different, Rackers thrive through our connection to a central goal: to be a valued member of a winning team on an inspiring mission. We bring our whole selves to work every day. And we embrace the notion that unique perspectives fuel innovation and enable us to best serve our customers and communities around the globe. We welcome you to apply today and want you to know that we are committed to offering equal employment opportunity without regard to age, color, disability, gender reassignment or identity or expression, genetic information, marital or civil partner status, pregnancy or maternity status, military or veteran status, nationality, ethnic or national origin, race, religion or belief, sexual orientation, or any legally protected characteristic. If you have a disability or special need that requires accommodation, please let us know.
 
 
