LLM Applied Data Scientist (RAG/NLP)

5 Minutes ago • All levels
Research Development

Job Description

Binance is seeking a highly skilled Research Scientist/Engineer to advance the reasoning and planning capabilities of large foundation models. This role involves enhancing model performance across the entire development lifecycle, including data acquisition, supervised fine-tuning (SFT), reward modelling, and reinforcement learning. You will synthesize large-scale, high-quality datasets, solve complex tasks using System 2 thinking and advanced decoding strategies, design robust evaluation methodologies, and build agents capable of addressing sophisticated real-world problems.
Good To Have:
  • Publications in top-tier conferences/journals (NeurIPS, ICML, ACL, CVPR, SIGIR, KDD, WWW).
  • Awards in ACM/ICPC or similar competitions.
Must Have:
  • Design, develop, and optimize data processing and retrieval pipelines for enterprise-level generative tasks and model training applications.
  • Research and evaluate advanced AI-native retrieval algorithms to strengthen LLM/VLM/Agentic AI capabilities.
  • Collaborate with infrastructure and application teams to integrate RAG pipelines into production systems.
  • Develop and optimize retrieval and ranking pipelines to improve user experience.
  • Participate in LLM training and RAG system development, staying current with techniques such as pre-training, SFT, and reinforcement learning.
  • Apply NLP, CV, and multimodal methods to analyze user-generated content.
  • Master’s in Information Retrieval, NLP, Machine Learning, Computer Vision, Multimodal Learning, or related fields.
  • Proficient in PyTorch with strong coding skills in Python or C++.
  • Solid theoretical foundation in information retrieval, NLP, and deep learning.
  • Hands-on experience with RAG, vector databases, multimodal/graph retrieval, or large-scale AI systems.
  • Strong engineering ability to translate research into scalable, production-level systems.
  • Self-driven, able to own projects end-to-end (design → implementation → deployment).
Perks:
  • Shape the future with the world’s leading blockchain ecosystem.
  • Collaborate with world-class talent in a user-centric global organization with a flat structure.
  • Tackle unique, fast-paced projects with autonomy in an innovative environment.
  • Thrive in a results-driven workplace with opportunities for career growth and continuous learning.
  • Competitive salary and company benefits.
  • Work-from-home arrangement.

About the Role

We are seeking a highly skilled Research Scientist/Engineer to advance the reasoning and planning capabilities of large foundation models. In this role, you will enhance model performance across the entire development lifecycle—including data acquisition, supervised fine-tuning (SFT), reward modelling, and reinforcement learning—while driving innovations in reasoning and decision-making. You will synthesise large-scale, high-quality datasets through rewriting, augmentation, and generation techniques to strengthen foundation models during pretraining, SFT, and RL stages. A key part of the role involves solving complex tasks using System 2 thinking and applying advanced decoding strategies such as MCTS and A*. You will design and implement robust evaluation methodologies, teach models to interact with external tools, APIs, and code interpreters, and build agents and multi-agent systems capable of addressing sophisticated real-world problems.

Responsibilities

  • Design, develop, and optimize data processing and retrieval pipelines for enterprise-level generative tasks and model training applications (Customer Service, Token Report, Web3 Domain Models). This includes embedding, reranking, context engineering, and query rewriting models.
  • Research and evaluate advanced AI-native retrieval algorithms (e.g., low-latency, multimodal retrieval, hierarchical retrieval, GraphRAG) to strengthen large-scale LLM/VLM/Agentic AI capabilities in Binance products.
  • Collaborate with infrastructure and application teams to integrate RAG pipelines into production systems, ensuring scalability, reliability, and measurable business impact.
  • Develop and optimize retrieval and ranking pipelines (indexing, vector search, retrieval scoring, reranking) to improve user experience.
  • Participate in LLM training and RAG system development, staying current with techniques such as pre-training, SFT, and reinforcement learning, and apply them to retrieval and generation tasks.
  • Apply NLP, CV, and multimodal methods to analyze user-generated content (classification, quality evaluation, trend detection, comment analysis).

Requirements

  • Master’s in Information Retrieval, NLP, Machine Learning, Computer Vision, Multimodal Learning, or related fields.
  • Proficient in PyTorch with strong coding skills in Python or C++.
  • Strong communication skills, intellectual curiosity, and passion for lifelong learning. Able to identify opportunities and drive cutting-edge retrieval & RAG technologies into real-world applications.
  • Solid theoretical foundation in information retrieval, NLP, and deep learning (experience with embeddings, reranking, query understanding preferred).
  • Hands-on experience with RAG, vector databases, multimodal/graph retrieval, or large-scale AI systems.
  • Strong engineering ability to translate research into scalable, production-level systems.
  • Self-driven, able to own projects end-to-end (design → implementation → deployment).
  • Publications in top-tier conferences/journals (NeurIPS, ICML, ACL, CVPR, SIGIR, KDD, WWW) are a plus; awards in ACM/ICPC or similar competitions preferred.

Why Binance

  • Shape the future with the world’s leading blockchain ecosystem
  • Collaborate with world-class talent in a user-centric global organization with a flat structure
  • Tackle unique, fast-paced projects with autonomy in an innovative environment
  • Thrive in a results-driven workplace with opportunities for career growth and continuous learning
  • Competitive salary and company benefits
  • Work-from-home arrangement (the arrangement may vary depending on the work nature of the business team)

Binance is committed to being an equal opportunity employer. We believe that having a diverse workforce is fundamental to our success.

By submitting a job application, you confirm that you have read and agree to our Candidate Privacy Notice.
