Research Scientist - Summer Intern 2026

S&P Global

Job Summary

Kensho, S&P Global's AI innovation hub, develops and deploys cutting-edge solutions in machine learning, natural language processing, and data discovery, focusing on generative AI applications for business and finance. The R&D team conducts pure NLP research, aiming for top-tier publications. As a Research Scientist Intern, you will contribute to a high-impact project, conduct and publish original research in core NLP areas like tokenization and evaluations, and develop novel state-of-the-art deep learning work, collaborating with experienced scientists and engineers.

Must Have

  • Currently enrolled in a PhD or Master’s program (Computer Science, Linguistics, or related technical field)
  • Expectation of returning to school after completion of the internship
  • Published in top NLP/ML conferences (e.g., ACL, NAACL, EMNLP, NeurIPS, COLM, ICML), ideally as first author
  • Relevant work experience (e.g., via internships, full-time, or at a lab)
  • Fluency in PyTorch
  • Required to work out of the Cambridge, MA HQ or New York City office

Job Description

Kensho is S&P Global’s hub for AI innovation and transformation. With expertise in machine learning, natural language processing, and data discovery, we develop and deploy novel solutions to innovate and drive progress at S&P Global and its customers worldwide. Kensho's solutions and research focus on business and financial generative AI applications, agents, data retrieval APIs, data extraction, and much more.

At Kensho, we hire talented people and give them the autonomy and support needed to build amazing technology and products. We collaborate using our teammates' diverse perspectives to solve hard problems. Our communication with one another is open, honest, and efficient. We dedicate time and resources to exploring new ideas, while staying rooted in engineering best practices. As a result, we can innovate rapidly to produce technology that is scalable, robust, and useful.

Are you looking to solve hard problems, and do you enjoy working with teammates with diverse perspectives? If so, we would love to help you excel here at Kensho. We are a collaborative group of experienced Machine Learning Engineers, Applied Scientists, and Research Scientists, whose academic backgrounds include doctorate degrees in NLP, theoretical physics, statistics, etc. We take pride in our team-based, tight-knit, startup-style Kenshin community, which fosters continuous learning and a communicative environment – allowing us to tackle the biggest challenges in data.

We value in-person collaboration; therefore, interns are required to work out of the Cambridge, MA HQ or our New York City office!

About the R&D Team:

Kensho’s R&D lab was created from scratch in 2022 with the mission of bringing cutting-edge innovation to Kensho and S&P Global at large. Our small, but growing, 7-person team conducts pure research – primarily in NLP – with topics including:

  • tokenization [3][4][8][9][10]
  • building challenging evaluation benchmarks [5][6][11][14]
  • long-form QA and reasoning [5][6]
  • document inconsistencies [15]
  • post-training techniques [12]
  • and more [1][2][7][13]

We prioritize publishing in ACL, NAACL, COLM, EMNLP, NeurIPS, ICML, etc. Relatedly, we host a thriving weekly Reading Group, where folks from all across Kensho’s ML Org volunteer and regularly attend.

We maintain close ties with academia (e.g., we collaborate on research with several universities, and our Head of R&D teaches NLP and ML at MIT). Since our inception, we have regularly presented our work to senior members (e.g., CEO, CFO) of S&P Global and in turn have had our research and models mentioned in S&P Global’s Quarterly Earnings Reports. Our early research with LLMs helped secure resources for several new, large Generative AI projects throughout Kensho.

About the Role

As a research scientist intern, you will be a core member of our lab and will work on a high-impact project that best aligns with your experience and research interests. You will be paired with a full-time Research Scientist mentor and will have the opportunity to present your work to all of Kensho.

You will actively conduct and publish research that concerns one of the following core NLP areas:

  • Tokenization
  • Evaluations
  • Other – based on shared interests and priorities

What You’ll Do:

  • Move the needle on unsolved problems in NLP/ML by conducting original research – with the goal of publishing your work in a top conference
  • Develop novel state-of-the-art work within deep learning (e.g., algorithms, models, datasets, analyses)
  • Collaborate with other research scientists, engineering leaders, and product managers
  • Contribute to a stellar engineering culture that values simplicity and function rooted in excellent design, documentation, testing, and code
  • Write clean, readable research code in PyTorch (not expected to write production-level code)

What We Look For:

  • Currently enrolled in a PhD or Master’s program (e.g., Computer Science, Linguistics, or a related technical field), with the expectation of returning to school after completion of the internship.
  • Publications in top NLP/ML conferences (e.g., ACL, NAACL, EMNLP, NeurIPS, COLM, ICML), ideally as first author.
  • Relevant work experience (e.g., via internships, full-time, or at a lab).
  • Fluency in PyTorch.

[1] A Graphical Approach to Document Layout Analysis. ICDAR 2023.

[2] An Analysis of Multilingual FActScore. EMNLP 2024.

[3] Tokenization is More Than Compression. EMNLP 2024.

[4] Greed is All You Need: An Evaluation of Tokenizer Inference Methods. ACL 2024.

[5] BizBench: A Quantitative Reasoning Benchmark for Business and Finance. ACL 2024.

[6] DocFinQA: A Long-Context Financial Reasoning Dataset. ACL 2024.

[7] Language Model Probabilities are Not Calibrated in Numeric Contexts. ACL 2025.

[8] Entropy-Driven Pre-tokenization for Byte Pair Encoding. ICML Workshop 2025.

[9] How Much is Enough? The Diminishing Returns of Tokenization Training Data. ICML Workshop 2025.

[10] Boundless Byte Encoding: Breaking the Pre-Tokenization Barrier. COLM 2025.

[11] SEC-QA: A Systematic Evaluation Corpus for Financial QA. EMNLP Workshop 2025.

[12] BLEUBERI: BLEU is a surprisingly effective reward for instruction following. NeurIPS 2025.

[13] Complexity Scaling Laws for Neural Models using Combinatorial Optimization. NeurIPS 2025.

[14] No Free Labels: Limitations of LLM-as-a-Judge without Human Grounding. Unpublished manuscript.

[15] Paper recently completed and submitted to ACL Rolling Review.

Recruitment Fraud Alert:

If you receive an email from a spglobalind.com domain or any other regionally based domains, it is a scam and should be reported to reportfraud@spglobal.com. S&P Global never requires any candidate to pay money for job applications, interviews, offer letters, “pre-employment training” or for equipment/delivery of equipment. Stay informed and protect yourself from recruitment fraud by reviewing our guidelines, fraudulent domains, and how to report suspicious activity here.

We are an equal opportunity employer that welcomes future Kenshins with all experiences and perspectives. Kensho is headquartered in Cambridge, MA, with an additional office location in New York City. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, or national origin.
