Big Data/Hadoop Engineer
Rackspace Technology
Job Summary
We are hiring for a Big Data / Hadoop Engineer to design, build, and maintain robust and scalable data pipelines for ETL/ELT processes within the Hadoop ecosystem. This role involves developing complex data processing applications using Spark (Python/Scala/Java), implementing efficient jobs with Hive and Pig, and working extensively with core Hadoop components. The engineer will also focus on performance tuning, data modeling, workflow management, and operational support for production Big Data platforms.
Job Description
We are hiring for a Big Data / Hadoop Engineer.
Experience - 3+ years
Location - Bangalore (3 days work from office)
Notice Period - Less than 30 days only
- Big Data Development: Design, build, and maintain robust and scalable data pipelines for ETL/ELT processes using tools within the Hadoop ecosystem.
- Programming & Scripting: Develop complex data processing applications using Spark (with Python/Scala/Java) and implement efficient jobs using Hive and Pig (a minimal Spark ETL sketch follows after this list).
- Hadoop Ecosystem Mastery: Work extensively with core Hadoop components including HDFS, YARN, MapReduce, Hive, HBase, Pig, Sqoop, and Flume.
- Performance Tuning: Optimize large-scale data processing jobs for speed and efficiency, troubleshooting performance bottlenecks across the cluster.
- Data Modeling: Collaborate with data scientists and analysts to design and implement optimized data models for the data lake/warehouse (e.g., Hive schemas, HBase column families); see the schema sketch after this list.
- Workflow Management: Implement and manage job orchestration and scheduling using tools like Oozie or similar workflow managers.
- Operational Support: Monitor, manage, and provide L2/L3 support for production Big Data platforms, ensuring high availability and reliability.
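To illustrate the Spark-based ETL work referenced in the responsibilities above, here is a minimal PySpark sketch of a batch job that reads raw CSV files from HDFS, cleans them, and writes a partitioned Hive table. The input path, table name, and column names (orders data under /data/raw/orders, a table analytics.orders_clean) are hypothetical placeholders for illustration, not details from this posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical HDFS input path and Hive target table -- adjust to the actual cluster layout.
RAW_PATH = "hdfs:///data/raw/orders"
TARGET_TABLE = "analytics.orders_clean"

spark = (
    SparkSession.builder
    .appName("orders-etl")
    .enableHiveSupport()          # allows reading/writing Hive metastore tables
    .getOrCreate()
)

# Extract: read raw CSV files landed on HDFS (e.g., by Sqoop or Flume ingestion).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv(RAW_PATH)
)

# Transform: drop malformed rows, normalize types, and derive a partition column.
clean = (
    raw.dropna(subset=["order_id", "order_date"])
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_date"))
       .withColumn("dt", F.date_format("order_date", "yyyy-MM-dd"))
)

# Load: write a partitioned, Parquet-backed Hive table for downstream Hive/Spark queries.
(
    clean.write
    .mode("overwrite")
    .partitionBy("dt")
    .format("parquet")
    .saveAsTable(TARGET_TABLE)
)

spark.stop()
```

In practice a job like this would be packaged and submitted to YARN with spark-submit, then triggered on a schedule by the workflow manager (Oozie or similar) named above.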
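For the data-modeling responsibility: the ETL sketch above lets Spark infer the table definition, whereas an explicitly modeled warehouse layout would typically be declared up front. The fragment below is a minimal, hypothetical example of such a Hive schema (date-partitioned, ORC-backed, as an alternative to the inferred Parquet definition above), issued through the same Hive-enabled SparkSession; the database, table, and column names are placeholders.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("warehouse-ddl")
    .enableHiveSupport()
    .getOrCreate()
)

# Hypothetical warehouse layout: a date-partitioned, ORC-backed fact table.
# Partition pruning on `dt` keeps daily queries from scanning the full table.
spark.sql("CREATE DATABASE IF NOT EXISTS analytics")
spark.sql("""
    CREATE TABLE IF NOT EXISTS analytics.orders_clean (
        order_id    STRING,
        customer_id STRING,
        amount      DOUBLE,
        order_date  DATE
    )
    PARTITIONED BY (dt STRING)
    STORED AS ORC
""")

spark.stop()
```

With a table declared up front like this, the ETL job would generally write into it with `insertInto` rather than `saveAsTable`, so the hand-designed schema stays authoritative.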