Extracting, Detecting, and Generating Research Questions for Scientific Articles

Sina Taslimi, Artemis Capari, Hosein Azarbonyad, Zi Long Zhu, Zubair Afzal, Evangelos Kanoulas, George Tsatsaronis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

The volume of academic articles is increasing rapidly, reflecting the growing emphasis on research and scholarship across different science disciplines. This rapid growth necessitates the development of tools for more efficient and rapid understanding of these articles. Clear and well-defined Research Questions (RQs) in research articles can help guide scholarly inquiries. However, many academic studies lack a proper definition of RQs in their articles. This research addresses this gap by presenting a comprehensive framework for the systematic extraction, detection, and generation of RQs from scientific articles. The extraction component uses a set of regular expressions to identify articles containing well-defined RQs. The detection component aims to identify more complex RQs in articles, beyond those captured by the rule-based extraction method. The RQ generation focuses on creating RQs for articles that lack them. We integrate all these components to build a pipeline to extract RQs or generate them based on the articles' full text. We evaluate the performance of the designed pipeline on a set of metrics designed to assess the quality of RQs. Our results indicate that the proposed pipeline can reliably detect RQs and generate high-quality ones.
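As a rough illustration of the rule-based extraction step described in the abstract, the sketch below matches explicitly labelled questions with two regular expressions. The patterns, the function names `extract_rqs` and `extract_or_fallback`, and the `generate` callback are all assumptions made for this example. The abstract does not list the authors' actual expressions or models, so this should not be read as their implementation.

```python
import re

# Hypothetical patterns for illustration only; the paper's actual regular
# expressions are not given in the abstract.
RQ_PATTERNS = [
    # Matches labels such as "RQ1: ...?" or "RQ 2. ...?"
    re.compile(r"\bRQ\s*\d*\s*[:.)]\s*(?P<question>[^?]+\?)"),
    # Matches labels such as "Research Question 1: ...?"
    re.compile(
        r"\bresearch\s+question\s*\d*\s*[:.)]\s*(?P<question>[^?]+\?)",
        re.IGNORECASE,
    ),
]


def extract_rqs(text):
    """Return explicitly labelled questions in document order, without duplicates."""
    found = []
    for pattern in RQ_PATTERNS:
        for match in pattern.finditer(text):
            # Collapse internal whitespace so line-wrapped questions compare equal.
            question = " ".join(match.group("question").split())
            found.append((match.start(), question))
    found.sort()
    result = []
    for _, question in found:
        if question not in result:
            result.append(question)
    return result


def extract_or_fallback(text, generate):
    """Use extracted questions when present; otherwise call a caller-supplied
    generator (a stand-in for the detection and generation components)."""
    questions = extract_rqs(text)
    return questions if questions else generate(text)


sample = (
    "We study two questions. RQ1: How well do regexes find questions? "
    "Research Question 2: Can a model generate missing ones? Other text."
)
print(extract_rqs(sample))
```

Patterns this simple only catch questions that carry an explicit label and end in a question mark, which is consistent with the abstract's point that a separate detection component is needed for more complex cases.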

Original language: English
Title of host publication: Main Conference
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Publisher: Association for Computational Linguistics (ACL)
Pages: 8573-8588
Number of pages: 16
ISBN (Electronic): 9798891761964
State: Published - 2025
Externally published: Yes
Event: 31st International Conference on Computational Linguistics, COLING 2025 - Abu Dhabi, United Arab Emirates
Duration: Jan 19 2025 to Jan 24 2025

Publication series

Name: Proceedings - International Conference on Computational Linguistics, COLING
Volume: Part F206484-1
ISSN (Print): 2951-2093

Conference

Conference: 31st International Conference on Computational Linguistics, COLING 2025
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 01/19/25 to 01/24/25
