TY - JOUR
T1 - An overview of the BioASQ large-scale biomedical semantic indexing and question answering competition
AU - Tsatsaronis, George
AU - Balikas, Georgios
AU - Malakasiotis, Prodromos
AU - Partalas, Ioannis
AU - Zschunke, Matthias
AU - Alvers, Michael R.
AU - Weissenborn, Dirk
AU - Krithara, Anastasia
AU - Petridis, Sergios
AU - Polychronopoulos, Dimitris
AU - Almirantis, Yannis
AU - Pavlopoulos, John
AU - Baskiotis, Nicolas
AU - Gallinari, Patrick
AU - Artières, Thierry
AU - Ngonga Ngomo, Axel-Cyrille
AU - Heino, Norman
AU - Gaussier, Eric
AU - Barrio-Alvers, Liliana
AU - Schroeder, Michael
AU - Androutsopoulos, Ion
AU - Paliouras, Georgios
N1 - Publisher Copyright:
© 2015 Tsatsaronis et al.
PY - 2015/4/30
Y1 - 2015/4/30
AB - Background: This article provides an overview of the first BioASQ challenge, a competition on large-scale biomedical semantic indexing and question answering (QA), which took place between March and September 2013. BioASQ assesses the ability of systems to semantically index very large numbers of biomedical scientific articles, and to return concise and user-understandable answers to given natural language questions by combining information from biomedical articles and ontologies. Results: The 2013 BioASQ competition comprised two tasks, Task 1a and Task 1b. In Task 1a participants were asked to automatically annotate new PubMed documents with MeSH headings. Twelve teams participated in Task 1a, submitting a total of 46 system runs; one of the teams performed consistently better than the MTI indexer used by NLM to suggest MeSH headings to curators. Task 1b used benchmark datasets containing 29 development and 282 test English questions, along with gold standard (reference) answers, prepared by a team of biomedical experts from around Europe; participants had to produce answers automatically. Three teams participated in Task 1b, with 11 system runs. The BioASQ infrastructure, including benchmark datasets, evaluation mechanisms, and the results of the participants and baseline methods, is publicly available. Conclusions: A publicly available evaluation infrastructure for biomedical semantic indexing and QA has been developed, which includes benchmark datasets, and can be used to evaluate systems that: assign MeSH headings to published articles or to English questions; retrieve relevant RDF triples from ontologies, and relevant articles and snippets from PubMed Central; and produce "exact" and paragraph-sized "ideal" answers (summaries). The results of the systems that participated in the 2013 BioASQ competition are promising. In Task 1a one of the systems performed consistently better than NLM's MTI indexer. In Task 1b the systems received high scores in the manual evaluation of the "ideal" answers; hence, they produced high-quality summaries as answers. Overall, BioASQ helped obtain a unified view of how techniques from text classification, semantic indexing, document and passage retrieval, question answering, and text summarization can be combined to allow biomedical experts to obtain concise, user-understandable answers to questions reflecting their real information needs.
KW - BioASQ Competition
KW - Hierarchical Text Classification
KW - Information retrieval
KW - Multi-document text summarization
KW - Passage retrieval
KW - Question answering
KW - Semantic indexing
UR - http://www.scopus.com/inward/record.url?scp=84929625248&partnerID=8YFLogxK
U2 - 10.1186/s12859-015-0564-6
DO - 10.1186/s12859-015-0564-6
M3 - Article
C2 - 25925131
AN - SCOPUS:84929625248
SN - 1471-2105
VL - 16
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - 1
M1 - 138
ER -