Exploring Unsupervised Features in Conditional Random Fields for Spanish Named Entity Recognition

Jenny Copara, Jose Ochoa, Camilo Thorne, Goran Glavas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Unsupervised features, such as word representations (mostly given by word embeddings), have been shown to significantly improve semi-supervised Named Entity Recognition (NER) for English. In this work we investigate whether unsupervised features can also boost (semi-)supervised NER in Spanish. To do so, we use word representations and collocations as additional features in a linear-chain Conditional Random Field (CRF) classifier. Experimental results (82.44% F-score on the CoNLL-2002 corpus and 65.72% F-score on the Ancora corpus) show that our approach is comparable to some state-of-the-art deep learning approaches for Spanish, in particular when using cross-lingual word representations.
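
As a rough illustration of what "word representations as additional features" can look like in practice, the following minimal Python sketch builds one feature dictionary per token, combining ordinary lexical features with cluster-prefix and embedding-dimension features. It is not the authors' feature set: the lookup tables `CLUSTERS` and `EMBEDDINGS`, their toy values, and the function names are all hypothetical, and collocation features are omitted.

```python
from typing import Dict, List

# Hypothetical toy lookup tables (assumptions, not data from the paper):
# a Brown-style cluster bit string and a tiny dense vector per lowercased word.
CLUSTERS: Dict[str, str] = {"madrid": "110100", "barcelona": "110101", "en": "0010"}
EMBEDDINGS: Dict[str, List[float]] = {
    "madrid": [0.12, -0.40],
    "barcelona": [0.10, -0.35],
}


def token_features(sentence: List[str], i: int) -> Dict[str, object]:
    """Build the feature dictionary for the token at position i."""
    word = sentence[i]
    lower = word.lower()

    # Standard supervised features computed from the token itself.
    feats: Dict[str, object] = {
        "bias": 1.0,
        "word.lower": lower,
        "word.istitle": word.istitle(),
        "word.isdigit": word.isdigit(),
        "suffix3": lower[-3:],
    }

    # Unsupervised feature 1: prefixes of the cluster bit string, so that
    # words in nearby clusters share the shorter prefixes.
    cluster = CLUSTERS.get(lower)
    if cluster is not None:
        for n in (2, 4, 6):
            feats[f"cluster.prefix{n}"] = cluster[:n]

    # Unsupervised feature 2: each embedding dimension as a real-valued feature.
    vector = EMBEDDINGS.get(lower)
    if vector is not None:
        for d, value in enumerate(vector):
            feats[f"emb.{d}"] = value

    # Neighbouring-token context, with markers at the sentence boundaries.
    if i > 0:
        feats["prev.lower"] = sentence[i - 1].lower()
    else:
        feats["BOS"] = True
    if i < len(sentence) - 1:
        feats["next.lower"] = sentence[i + 1].lower()
    else:
        feats["EOS"] = True
    return feats


def sentence_features(sentence: List[str]) -> List[Dict[str, object]]:
    """Feature dictionaries for every token of one sentence."""
    return [token_features(sentence, i) for i in range(len(sentence))]
```

Calling `sentence_features(["Vive", "en", "Madrid"])` returns three dictionaries. Sequences of per-token feature dictionaries like these are a common input format for linear-chain CRF toolkits, so a sketch of this kind could be paired with one of them for training.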

Original language: English
Title of host publication: Proceedings - 2016 5th Brazilian Conference on Intelligent Systems, BRACIS 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 283-288
Number of pages: 6
ISBN (Electronic): 9781509035663
State: Published - Feb 1 2017
Externally published: Yes
Event: 5th Brazilian Conference on Intelligent Systems, BRACIS 2016 - Recife, Pernambuco, Brazil
Duration: Oct 9 2016 to Oct 12 2016

Conference

Conference: 5th Brazilian Conference on Intelligent Systems, BRACIS 2016
Country/Territory: Brazil
City: Recife, Pernambuco
Period: 10/9/16 to 10/12/16

Keywords

  • Conditional Random Fields
  • NER for Spanish
  • Unsupervised features
  • Word embeddings
  • Word Representations
