Let's agree to disagree: On the evaluation of vocabulary alignment

Anna Tordai, Jacco Van Ossenbruggen, Guus Schreiber, Bob Wielinga

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

15 Scopus citations

Abstract

Gold standard mappings created by experts are at the core of alignment evaluation. At the same time, the process of manual evaluation is rarely discussed. While the practice of having multiple raters evaluate results is accepted, their level of agreement is often not measured. In this paper we describe three experiments in manual evaluation and study the way different raters evaluate mappings. We used alignments generated with different techniques and between vocabularies of different types. In each experiment, five raters evaluated alignments and talked through their decisions using the think-aloud method. In all three experiments we found that inter-rater agreement was low, and we analyzed our data to find the reasons for it. Our analysis shows which variables can be controlled to affect the level of agreement, including the mapping relations, the evaluation guidelines, and the background of the raters. On the other hand, differences in the perception of raters, and the complexity of the relations between often ill-defined natural language concepts, remain inherent sources of disagreement. Our results indicate that the manual evaluation of ontology alignments is by no means an easy task and that the ontology alignment community should be careful in the construction and use of reference alignments.
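The abstract notes that raters' level of agreement is often not measured. For more than two raters, a standard chance-corrected measure is Fleiss' kappa. The sketch below is illustrative only and is not taken from the paper; the rating counts are invented, and the five-raters/two-categories setup merely mirrors the experimental shape described above.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a fixed number of raters per item.

    counts[i][j] = number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters.
    """
    N = len(counts)       # number of items (e.g. candidate mappings)
    n = sum(counts[0])    # raters per item
    k = len(counts[0])    # number of rating categories

    # Mean per-item observed agreement P_i
    P_bar = sum(
        (sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts
    ) / N

    # Chance agreement P_e from the marginal category proportions
    p = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    P_e = sum(pj * pj for pj in p)

    return (P_bar - P_e) / (1 - P_e)


# Hypothetical data: 4 mappings judged correct/incorrect by 5 raters.
kappa = fleiss_kappa([[5, 0], [3, 2], [4, 1], [2, 3]])
print(round(kappa, 3))  # 0.048 -- barely better than chance
```

A kappa near 0 indicates agreement no better than chance, while 1.0 indicates perfect agreement; reporting such a statistic alongside a gold standard makes the reliability of the reference alignment explicit.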

Original language: English
Title of host publication: KCAP 2011 - Proceedings of the 2011 Knowledge Capture Conference
Pages: 65-72
Number of pages: 8
DOIs
State: Published - 2011
Externally published: Yes
Event: 6th International Conference on Knowledge Capture, KCAP 2011 - Banff, AB, Canada
Duration: Jun 26 2011 - Jun 29 2011

Publication series

Name: KCAP 2011 - Proceedings of the 2011 Knowledge Capture Conference

Conference

Conference: 6th International Conference on Knowledge Capture, KCAP 2011
Country/Territory: Canada
City: Banff, AB
Period: 06/26/11 - 06/29/11

Keywords

  • empirical study
  • inter-rater agreement
  • manual evaluation
  • vocabulary alignment
