TY - GEN
T1 - Generating Topic Pages for Scientific Concepts Using Scientific Publications
AU - Azarbonyad, Hosein
AU - Afzal, Zubair
AU - Tsatsaronis, George
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - In this paper, we describe Topic Pages, an inventory of scientific concepts and information around them extracted from a large collection of scientific books and journals. The main aim of Topic Pages is to provide all the necessary information to the readers to understand scientific concepts they come across while reading scholarly content in any scientific domain. Topic Pages are a collection of automatically generated information pages using NLP and ML, each corresponding to a scientific concept. Each page contains three pieces of information: a definition, related concepts, and the most relevant snippets, all extracted from scientific peer-reviewed publications. In this paper, we discuss the details of different components to extract each of these elements. The collection of pages in production contains over 360,000 Topic Pages across 20 different scientific domains with an average of 23 million unique visits per month, constituting it a popular source for scientific information.
AB - In this paper, we describe Topic Pages, an inventory of scientific concepts and information around them extracted from a large collection of scientific books and journals. The main aim of Topic Pages is to provide all the necessary information to the readers to understand scientific concepts they come across while reading scholarly content in any scientific domain. Topic Pages are a collection of automatically generated information pages using NLP and ML, each corresponding to a scientific concept. Each page contains three pieces of information: a definition, related concepts, and the most relevant snippets, all extracted from scientific peer-reviewed publications. In this paper, we discuss the details of different components to extract each of these elements. The collection of pages in production contains over 360,000 Topic Pages across 20 different scientific domains with an average of 23 million unique visits per month, constituting it a popular source for scientific information.
KW - Definition extraction
KW - Multi-document summarization
KW - Scientific document processing
UR - http://www.scopus.com/inward/record.url?scp=85150991658&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-28238-6_23
DO - 10.1007/978-3-031-28238-6_23
M3 - Conference contribution
AN - SCOPUS:85150991658
SN - 9783031282379
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 341
EP - 349
BT - Advances in Information Retrieval - 45th European Conference on Information Retrieval, ECIR 2023, Proceedings
A2 - Kamps, Jaap
A2 - Goeuriot, Lorraine
A2 - Crestani, Fabio
A2 - Maistro, Maria
A2 - Joho, Hideo
A2 - Davis, Brian
A2 - Gurrin, Cathal
A2 - Caputo, Annalina
A2 - Kruschwitz, Udo
PB - Springer Science and Business Media Deutschland GmbH
T2 - 45th European Conference on Information Retrieval, ECIR 2023
Y2 - 2 April 2023 through 6 April 2023
ER -