Data Engineer - Consumer Data AI / ML

Posted 1 month ago • 4+ years of experience • $96,000 - $200,000 per year
Data Analysis

Job Description

The ideal candidate will bring strong AI/ML experience to designing, building, and optimizing scalable data pipelines and infrastructure that power advanced analytics and machine learning solutions. In this role, you will collaborate closely with data scientists, software engineers, and business stakeholders to prepare and transform large datasets, support end-to-end model development and deployment, and ensure robust, efficient, and secure data flows. You will apply your expertise in cloud platforms, big data tools, and machine learning frameworks to drive innovation and deliver actionable insights that advance our organization’s AI initiatives and business objectives.
Good To Have:
  • Java
  • Python
  • Scala
  • SQL
Must Have:
  • Design, build, and maintain scalable data pipelines and ETL processes to support machine learning and AI initiatives on Google Cloud Platform (GCP).
  • Implement and optimize data storage solutions using GCP services such as BigQuery, Cloud Storage, and Dataflow.
  • Ensure data quality, integrity, and security throughout the data lifecycle.
  • Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver actionable insights.
  • Monitor, troubleshoot, and maintain the health and performance of cloud-based data infrastructure.
  • Automate manual processes and repetitive tasks to improve efficiency and reduce errors.
  • Apply data governance and compliance best practices to protect sensitive information and meet regulatory standards.
  • Stay current with new GCP features, tools, and best practices to continuously enhance data management capabilities.
  • Document solutions, processes, and architectural decisions to facilitate knowledge sharing and maintainability.
  • BS or MS in Computer Science or a related major, or equivalent experience.
  • 4+ years of software engineering experience, with a strong emphasis on system design and backend development.
  • 2+ years of hands-on experience with the Google Cloud Platform ecosystem (BigQuery, Dataproc, Composer, Dataflow, Data Catalog, Observability) or the AWS equivalents.
  • Proven ability to design, build, and maintain data pipelines that support machine learning and AI model development, training, and deployment.
  • Familiarity with data security, compliance, and governance best practices.
  • Strong problem-solving skills, attention to detail, and ability to work collaboratively with cross-functional teams.
  • Excellent communication skills, including the ability to tell insightful stories using data and to manage communication with internal teams and stakeholders.
Perks:
  • Flexible hybrid work options
  • Healthcare
  • 401(k)
  • Backup childcare
  • Education stipends

It takes powerful technology to connect our brands and partners with an audience of hundreds of millions of people. Whether you’re looking to write mobile app code, engineer the servers behind our massive ad tech stacks, or develop algorithms to help us process trillions of data points a day, what you do here will have a huge impact on our business—and the world.

The material job duties and responsibilities of this role include those listed above as well as adhering to Yahoo policies; exercising sound judgment; working effectively, safely and inclusively with others; exhibiting trustworthiness and meeting expectations; and safeguarding business operations and brand integrity.

At Yahoo, we offer flexible hybrid work options that our employees love! While most roles don’t require regular office attendance, you may occasionally be asked to attend in-person events or team sessions. You’ll always get notice to make arrangements. Your recruiter will let you know if a specific job requires regular attendance at a Yahoo office or facility. If you have any questions about how this applies to the role, just ask the recruiter!
