Atlas AI Training Data Curation Internship: Fueling Knowledge Graph Intelligence

Job Description

This internship with the Atlas AI team at Cognite focuses on leveraging AI to transform industrial data interactions. The intern will be responsible for the end-to-end process of curating and preparing datasets and using them to fine-tune large language models (LLMs) that generate precise queries for an industrial knowledge graph. This role is critical for enhancing the accuracy and effectiveness of knowledge graph querying capabilities, contributing to the foundational data infrastructure for next-generation AI agents. The internship spans 6-8 weeks, starting in early July, with interns working in pairs.
Good To Have:
  • Experience with cloud platforms (Google Cloud Platform, AWS, Azure) for ML workloads.
  • Familiarity with knowledge graphs, graph databases, or semantic web technologies.
  • Experience with MLOps practices or experiment tracking tools.
  • Understanding of prompt engineering and agent design principles.
Must Have:
  • Designing and implementing strategies for collecting and annotating high-quality training data.
  • Working with domain experts to ensure data accuracy and relevance.
  • Developing Python scripts for data cleaning, transformation, and formatting.
  • Ensuring data privacy and compliance standards.
  • Experimenting with fine-tuning techniques for pre-trained LLMs on curated datasets.
  • Utilizing cloud AI platforms for model training and deployment.
  • Monitoring training progress and analyzing model performance metrics.
  • Solid theoretical understanding of machine learning concepts, including NLP and LLMs.
  • Advanced proficiency in Python for data manipulation, scripting, and ML frameworks (e.g., TensorFlow, PyTorch, Hugging Face Transformers).
  • Experience with data analysis libraries (e.g., Pandas, NumPy) and data visualization.
  • Strong analytical and problem-solving abilities.
  • Ability to work effectively in a team and communicate technical concepts clearly.
Perks:
  • Join an organization of 70 different nationalities with Diversity, Equality and Inclusion (DEI) in focus.
  • A highly modern and fun working environment with sublime culture across the organization.
  • Flat structure with direct access to decision-makers and minimal bureaucracy.
  • Opportunity to work with and learn from some of the best people on some of the most ambitious projects found anywhere, across industries.
  • Join our HUB to be part of the conversation directly with Cogniters and our partners.

Product – Engineering / Intern / On-site

About Cognite

Embark on a transformative journey with Cognite, a global SaaS forerunner in leveraging AI and data to unravel complex business challenges through our cutting-edge offerings, including Cognite Atlas AI, an industrial agent workbench, and the Cognite Data Fusion (CDF) platform. We were named the 2022 Technology Innovation Leader for Global Digital Industrial Platforms, and Cognite was recognized as the 2024 Microsoft Energy and Resources Partner of the Year. In the realm of industrial digital transformation, we stand at the forefront, reshaping the future of Oil & Gas, Chemicals, Pharma and other Manufacturing and Energy sectors. Join us in this venture where AI and data meet ingenuity, and together, we forge the path to a smarter, more connected industrial future.

Learn more about Cognite here

Our values

Impact: Cogniters strive to make an impact in all that they do. We are result-oriented, always asking ourselves.

Ownership: Cogniters embrace a culture of ownership. We go beyond our comfort zones to contribute to the greater good, fostering inclusivity and sharing responsibilities for challenges and success.

Relentless: Cogniters are relentless in their pursuit of innovation. We are determined and deliverable (never ruthless or reckless), facing challenges head-on and viewing setbacks as opportunities for growth.

The Atlas AI team is at the forefront of leveraging AI to transform industrial data interactions. A key initiative involves fine-tuning large language models (LLMs) to generate precise queries for our industrial knowledge graph, enabling intelligent agents to extract relevant insights. The success of this endeavor relies heavily on high-quality training data.

This internship project offers an exceptional opportunity for students passionate about Machine Learning and AI to contribute to the foundational data infrastructure for our next-generation AI agents. The intern will be responsible for the end-to-end process of curating and preparing datasets, fine-tuning models on them, and running the results on our evaluation frameworks. This role is critical for enhancing the accuracy and effectiveness of our knowledge graph querying capabilities.

This internship will span 6-8 weeks, commencing in the first week of July. Interns will work collaboratively in pairs, fostering a dynamic and supportive learning environment.

Project Scope & Activities

  • Training Data Curation & Preparation:
      • Designing and implementing strategies for collecting and annotating high-quality training data specific to industrial knowledge graph query generation.
      • Working with domain experts to ensure the accuracy and relevance of the curated data.
      • Developing scripts and tools in Python to automate data cleaning, transformation, and formatting for model training.
      • Ensuring data privacy and compliance standards are met during curation.
  • Model Fine-tuning:
      • Experimenting with various fine-tuning techniques for pre-trained large language models (LLMs) on the curated datasets.
      • Utilizing cloud AI platforms (Google Cloud AI Platform, AWS SageMaker, Azure Machine Learning) for model training and deployment.
      • Monitoring training progress, analyzing model performance metrics, and iterating on fine-tuning strategies.

Expected Outcomes

  • Successfully curated a high-quality dataset suitable for fine-tuning LLMs for knowledge graph query generation.
  • Contributed to the fine-tuning of LLMs on major cloud platforms (Google, AWS, Azure), leading to improved query generation capabilities.
  • Provided actionable insights and recommendations for the product based on data analysis and model performance.
  • Gained significant practical experience in applied machine learning, data engineering for AI, and cloud computing environments.
  • Authored clear documentation on data curation processes, fine-tuning experiments, and evaluation methodologies.

Required Skills & Qualifications

  • Machine Learning / Artificial Intelligence: Solid theoretical understanding of machine learning concepts, including natural language processing (NLP) and large language models.
  • Python: Advanced proficiency in Python for data manipulation, scripting, and ML framework utilization (e.g., TensorFlow, PyTorch, Hugging Face Transformers).
  • Data Analysis: Experience with data analysis libraries (e.g., Pandas, NumPy) and data visualization.
  • Problem-Solving: Strong analytical and problem-solving abilities, with a methodical approach to data challenges.
  • Collaboration: Ability to work effectively in a team, communicate technical concepts clearly, and adapt to evolving project requirements.

Bonus Skills (Nice to Have):

  • Experience with cloud platforms (Google Cloud Platform, AWS, Azure) for ML workloads.
  • Familiarity with knowledge graphs, graph databases, or semantic web technologies.
  • Experience with MLOps practices or experiment tracking tools.
  • Understanding of prompt engineering and agent design principles.

A snapshot of our many perks and benefits as a Cogniter

  • Join an organization of 70 different nationalities 🌐 with Diversity, Equality and Inclusion (DEI) in focus 🤝
  • A highly modern and fun working environment with sublime culture across the organization; follow us on Instagram @cognitedata 📷 to learn more
  • Flat structure with direct access to decision-makers and minimal bureaucracy
  • Opportunity to work with and learn from some of the best people on some of the most ambitious projects found anywhere, across industries
  • Join our HUB 🗣️ to be part of the conversation directly with Cogniters and our partners.

Why choose Cognite? 🏆 🚀

Join us in making a real and lasting impact at one of the most exciting and fastest-growing new software companies in the world. We have repeatedly demonstrated that digital transformation, when anchored on strong DataOps, drives business value and sustainability for clients and allows front-line workers, as well as domain experts, to make better decisions every single day. We were recognized as one of CNBC's top global enterprise technology startups powering digital transformation! And just recently, Frost & Sullivan named Cognite a Technology Innovation Leader! 🥇 Most recently, Cognite Data Fusion® achieved industry-first DNV compliance for digital twins 🥇

Apply today!

If you're excited about the opportunity to work at Cognite and make a difference in the tech industry, we encourage you to apply today! We welcome candidates of all backgrounds and identities to join our team. Please do not hesitate to contact our Talent Acquisition team with any questions.

We encourage you to follow us on Cognite LinkedIn; we post all our openings there.

Equal Opportunity

Cognite is committed to creating a diverse and inclusive environment at work and is proud to be an equal opportunity employer. All qualified applicants will receive the same level of consideration for employment; everyone we hire will receive the same level of consideration for training, compensation, and promotion.

We ask for gender as part of our application because we want to ensure equal assessment in the recruitment process. Your answer will help us meet this commitment! However, the question about gender is optional, and your choice not to answer will not affect the assessment of your application in any way.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
