AI Senior Data Engineer (R-18545)
Dun & Bradstreet
Job Description
Why We Work at Dun & Bradstreet
Dun & Bradstreet combines global data and local expertise to help clients make smarter decisions. With 6,000+ people in 31 countries, we are a team of diverse thinkers and problem solvers who all share a common curiosity: to find new ways to turn data into value. If you share this curiosity and want to be part of a future-ready company, come join us! Learn more at dnb.com/careers.
We are seeking a highly skilled Senior AI Data Engineer to architect and build scalable data pipelines and infrastructure for AI and Generative AI applications. This role focuses on managing unstructured data, embedding generation, and vector database operations to support advanced AI workflows such as Retrieval-Augmented Generation (RAG), semantic search, and agentic systems.
You will work closely with AI engineers, data scientists, and platform architects to deliver high-performance, secure, and resilient data systems that power enterprise-grade AI solutions.
What’s on Offer at D&B Ireland
- 25 days annual leave (plus 2 paid volunteer days & 1 paid un-sick day)
- Holiday buy & sell (the option to buy or sell up to 5 additional days per year)
- Flexible working - hybrid model
- Employee Health Insurance
- Mental Health Support program
- Pension Contribution
- Family Friendly Leave (Maternity, Paternity, Parental, Marriage and Bereavement)
- Life Assurance
- Educational Assistance Program
- Life-Style Account (D&B will match your contributions up to €40 per month and can be used to claim for a range of health-related, leisure or lifestyle activities)
At Dun & Bradstreet, we are 6,000 friendly colleagues around the world waiting to meet you and give you the opportunity to grow your career.
As part of the RDI team, you will:
- Design and implement ETL/ELT pipelines for structured and unstructured data, optimized for embedding generation and vector indexing.
- Build and maintain embedding pipelines using OpenAI, Hugging Face, or Vertex AI models.
- Develop and manage vector databases for semantic search and similarity matching.
- Integrate metadata-aware filtering and real-time upserts for contextual and dynamic querying.
- Collaborate with platform teams to ensure secure provisioning, RBAC, encryption, and compliance with data governance policies.
- Optimize performance using GPU acceleration, tiered storage, and ANN indexing algorithms.
- Support observability, monitoring, and incident response for data pipelines and vector infrastructure.
- Contribute to the AIBE Data Hub and RAG-as-a-Service initiatives.
About you:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- 8+ years of experience in data engineering, with at least 3 years managing unstructured data and embedding pipelines.
- Strong proficiency in Python and data pipeline tools (Dataflow, Pub/Sub, Cloud Functions, Cloud Storage, Apache Beam, Airflow, Vertex AI Pipelines, Kafka, Flink).
- Experience with embedding models and vector search platforms (e.g., BigQuery Vector Search, Qdrant).
- Deep understanding of unstructured data processing, especially text and PDFs.
- Familiarity with cloud platforms (Azure, GCP, AWS) and services like Databricks, Vertex AI, and Dataflow.
- Knowledge of containerization and microservices (Docker, Kubernetes).
- Experience with CI/CD pipelines, Terraform, and DevSecOps practices.
Desirable Experience
- Hands-on experience with hybrid search and agentic workflows.
- Familiarity with multimodal analytics and joining structured/unstructured data for advanced use cases.
- Contributions to platform-level services such as RAG-as-a-Service and embedding generation frameworks.
We appreciate that you may not meet every criterion listed above, but if you have the passion and eagerness to learn and grow, we want to hear from you!
All employees and contractors working in D&B should be aware that they have responsibilities in relation to the Company’s Business Management System. These cover information security, quality, environment, and health and safety, both during and after employment with D&B.
Dun & Bradstreet is an Equal Opportunity Employer
All Dun & Bradstreet job postings can be found at https://jobs.lever.co/dnb. Official communication from Dun & Bradstreet will come from an email address ending in @dnb.com.
Notice to Applicants: Please be advised that this job posting page is hosted and powered by Lever, a subsidiary of Employ Inc. Your use of this page is subject to Employ's Privacy Notice and Cookie Policy, which govern the processing of visitor data on this platform.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please visit https://bit.ly/3LMn4CQ.