Taxonomy Generation for Scientific Concepts Using Large Language Models

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Traditional data-driven automatic taxonomy generation methods struggle with complex, large, and domain-specific datasets. To address these issues, this study leverages Large Language Models (LLMs) to automate key stages of taxonomy generation, focusing on scientific concepts. Our approach employs LLMs at several stages of the taxonomy generation process, including extracting candidate concepts and organizing keywords into taxonomies centered around chosen scientific concepts. By incorporating LLMs, we aim to enhance depth, accuracy, and coherence of generated taxonomies. Comparative analyses show that the proposed LLM-based taxonomy generation method outperforms state-of-the-art taxonomy generation methods on several metrics, such as concept coherence and coverage. Using a hybrid evaluation framework that combines automatic and human assessments, we demonstrate that our LLM-based solution is scalable, adaptable, and capable of generating high-quality taxonomies tailored to specific scientific concepts.

Original language: English
Title of host publication: Experimental IR Meets Multilinguality, Multimodality, and Interaction - 16th International Conference of the CLEF Association, CLEF 2025, Proceedings
Editors: Jorge Carrillo-de-Albornoz, Alba García Seco de Herrera, Julio Gonzalo, Laura Plaza, Josiane Mothe, Florina Piroi, Paolo Rosso, Damiano Spina, Guglielmo Faggioli, Nicola Ferro
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 74-86
Number of pages: 13
ISBN (Print): 9783032043535
State: Published - 2026
Externally published: Yes
Event: 16th International Conference of the Cross-Language Evaluation Forum for European Languages, CLEF 2025 - Madrid, Spain
Duration: Sep 9 2025 to Sep 12 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16089 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Conference of the Cross-Language Evaluation Forum for European Languages, CLEF 2025
Country/Territory: Spain
City: Madrid
Period: 09/9/25 to 09/12/25

Keywords

  • Automatic Taxonomy Construction
  • LLMs for Taxonomy Construction
  • Scientific Document Processing
