A generalized vector space model for text retrieval based on semantic relatedness

George Tsatsaronis, Vicky Panagiotopoulou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

59 Scopus citations

Abstract

Generalized Vector Space Models (GVSM) extend the standard Vector Space Model (VSM) by embedding additional types of information, besides terms, in the representation of documents. An interesting type of information that can be used in such models is semantic information from word thesauri like WordNet. Previous attempts to construct GVSMs reported contradictory results. The most challenging problem is to incorporate the semantic information in a theoretically sound and rigorous manner and to modify the standard interpretation of the VSM. In this paper we present a new GVSM model that exploits WordNet's semantic information. The model is based on a new measure of semantic relatedness between terms. An experimental study conducted on three TREC collections reveals that semantic information can boost text retrieval performance with the use of the proposed GVSM.
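To make the general idea concrete, the following is a minimal sketch of a generalized cosine similarity in which term-to-term relatedness enters through a matrix R, so that two different but related terms can still contribute to a match. It illustrates the generic GVSM form only and is not the measure or model proposed in the paper. The function name `gvsm_similarity`, the three-term vocabulary, and the relatedness value 0.9 are all hypothetical and chosen for illustration.

```python
import math


def gvsm_similarity(query, doc, relatedness):
    """Generalized cosine: (q^T R d) / sqrt((q^T R q) * (d^T R d)).

    With R equal to the identity matrix this reduces to the standard
    VSM cosine similarity. R is assumed symmetric with non-negative
    entries (an assumption of this sketch, not taken from the paper).
    """
    def bilinear(u, v):
        # Sum over all term pairs (i, j), weighted by their relatedness.
        return sum(
            u[i] * relatedness[i][j] * v[j]
            for i in range(len(u))
            for j in range(len(v))
        )

    denom = math.sqrt(bilinear(query, query) * bilinear(doc, doc))
    return bilinear(query, doc) / denom if denom > 0 else 0.0


# Hypothetical vocabulary: ["car", "automobile", "banana"]
identity = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
# Hypothetical relatedness scores; a real system might derive them
# from a thesaurus such as WordNet.
related = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

query = [1.0, 0.0, 0.0]  # query contains only "car"
doc = [0.0, 1.0, 0.0]    # document contains only "automobile"

vsm_score = gvsm_similarity(query, doc, identity)   # no shared terms
gvsm_score = gvsm_similarity(query, doc, related)   # related terms count
print(vsm_score, gvsm_score)
```

With the identity matrix the query and document share no terms and score zero, while the relatedness matrix lets the related pair contribute to the score. How the relatedness values are computed and how the document representation is modified are exactly the points the paper addresses, and this sketch makes no claim about them.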

Original language: English
Title of host publication: Student Research Workshop, Demonstrations, Tutorial Abstracts
Publisher: Association for Computational Linguistics (ACL)
Pages: 70-78
Number of pages: 9
ISBN (Print): 9781932432169
State: Published - 2009
Externally published: Yes
Event: 12th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2009 - Athens, Greece
Duration: Mar 30 2009 to Apr 3 2009

Publication series

Name: EACL 2009 - 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings

Conference

Conference: 12th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2009
Country/Territory: Greece
City: Athens
Period: 03/30/09 to 04/3/09