Towards Confidence Estimation for Typed Protein-Protein Relation Extraction

Camilo Thorne, Roman Klinger

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Systems that build on top of information extraction are typically challenged to extract knowledge that, while correct, is not yet well known. We hypothesize that a good confidence measure for relational information has the property that such interesting information is found between information extracted with very high confidence and information extracted with very low confidence. We discuss confidence estimation for protein-protein relation discovery in biomedical literature. Because facts reported in papers take some time to be validated and recorded in biomedical databases, this task gives rise to large quantities of unknown but potentially true candidate relations. It is thus important to rank them based on supporting evidence rather than discard them. In this paper, we discuss this task and propose different approaches for confidence estimation, together with a pipeline to evaluate such methods. We show that the most straightforward approach, a combination of different confidence measures from pipeline modules, seems not to work well. We discuss this negative result and pinpoint potential future research directions.

Original language: English
Title of host publication: Proceedings of the Biomedical NLP Workshop, BioNLP 2017, associated with RANLP 2017
Editors: Svetla Boytcheva, Kevin Bretonnel Cohen, Guergana Savova, Galia Angelova
Publisher: Incoma Ltd
Pages: 55-63
Number of pages: 9
ISBN (Electronic): 9789544520441
State: Published - 2017
Event: 2017 Biomedical NLP Workshop, BioNLP 2017 - Varna, Bulgaria
Duration: Sep 8 2017 → …

Publication series

Name: International Conference Recent Advances in Natural Language Processing, RANLP
ISSN (Print): 1313-8502

Conference

Conference: 2017 Biomedical NLP Workshop, BioNLP 2017
Country/Territory: Bulgaria
City: Varna
Period: 09/8/17 → …
