Create end-to-end ETL workflows using Apache Airflow, defining task dependencies and schedules to ensure timely and accurate data processing;
Establish best practices for organizing data within S3 buckets, including folder structures, partitioning strategies, and object tagging for improved data discoverability and management;
Manage access controls and permissions on S3 buckets and Athena tables using IAM policies, ensuring compliance with regulatory requirements and data governance standards;
Collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders, to understand data requirements and deliver scalable and reliable solutions;
Document data storage architectures, ETL pipelines, and query optimization techniques; maintain clear and up-to-date documentation for reference and knowledge sharing within the team.
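The dependency-driven orchestration described above can be illustrated with a small sketch. The stage names (extract, validate, transform, load) are hypothetical examples, not part of this role's actual pipelines; the stdlib `graphlib` module is used here to show the same topological ordering that Airflow applies when resolving a DAG's task dependencies.

```python
# Minimal sketch of ETL task ordering under assumed stage names.
# Airflow resolves a DAG the same way: a task runs only after all
# of its upstream tasks have completed.
from graphlib import TopologicalSorter

# Each key lists its upstream dependencies (tasks that must finish first).
dag = {
    "extract": set(),
    "validate": {"extract"},     # runs in parallel with transform
    "transform": {"extract"},
    "load": {"transform", "validate"},
}

# static_order() yields tasks so every dependency precedes its dependents.
order = list(TopologicalSorter(dag).static_order())
```

In Airflow itself the same shape would be expressed with operators and the `>>` dependency syntax inside a `DAG` definition; the ordering semantics are identical.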
Proven experience working as a data engineer with a focus on AWS services;
Strong proficiency in building data pipelines using Apache Airflow;
Expertise in AWS services such as S3, Athena, Glue, IAM roles, Redshift;
Solid understanding of ETL processes and data modeling concepts;
Proficiency in SQL and experience working with large datasets;
Familiarity with version control systems (e.g., Git) and CI/CD pipelines;
Excellent problem-solving and troubleshooting skills;
Understanding of MLOps is a plus;
Level of English: Intermediate/Upper-Intermediate;
Strong communication and presentation skills;
Attention to detail;
Proactive and result-oriented mindset.
GROWE TOGETHER: Our team is our main asset. We work together and support each other to achieve our common goals;
DRIVE RESULT OVER PROCESS: We set ambitious, clear, measurable goals in line with our strategy and drive Growe to success;
BE READY FOR CHANGE: We see challenges as opportunities to grow and evolve. We adapt today to win tomorrow.