Abstract

Word representations such as word embeddings have been shown to significantly improve (semi-)supervised NER for English. In this work, we investigate whether word representations can also boost (semi-)supervised NER in Spanish. To do so, we use word representations as additional features in a linear-chain Conditional Random Field (CRF) classifier. Experimental results (82.44 F-score on the CoNLL-2002 corpus) show that our approach is comparable to some state-of-the-art Deep Learning approaches for Spanish, in particular when using

Highlights

  • Supervised NER models require large amounts of labeled data to achieve good performance, and such data is often hard to acquire or generate

  • Our work focuses on using word representations as features for supervised NER for Spanish

  • The first results for supervised Spanish NER on the CoNLL 2002 corpus, by Carreras et al (2002), used a feature set with gazetteers and external knowledge and reached an F1-score of 81.39%


Summary

Introduction

Supervised NER models require large amounts of (manually) labeled data to achieve good performance, and such data is often hard to acquire or generate. Unlabeled data can help: word representations learned from it can enrich and boost supervised NER models trained on small gold standards. For English NER, Passos et al (2014) and Guo et al (2014) show that (large) word embeddings yield better results than clustering. To investigate whether word representations also help for Spanish, we follow the approach of Guo et al (2014), combining probabilistic graphical models learned from the CoNLL 2002 corpus with word representations learned from large unlabeled Spanish corpora, while exploring the settings and feature combinations that match state-of-the-art algorithms for NER in Spanish.
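As a rough illustration of what "word representations as additional features" can look like for a linear-chain CRF, here is a minimal sketch. It is not the feature set used in this work: the baseline feature names, the `emb_{d}` naming, and the three-dimensional toy vectors in `TOY_EMBEDDINGS` are all assumptions made for the example. Real vectors would be learned from a large unlabeled Spanish corpus.

```python
from typing import Dict, List, Sequence

# Toy embedding table with made-up values (assumption for illustration only).
TOY_EMBEDDINGS: Dict[str, Sequence[float]] = {
    "madrid": (0.8, -0.1, 0.3),
    "en": (-0.2, 0.5, 0.0),
    "vive": (0.1, 0.4, -0.6),
}


def token_features(sentence: List[str], i: int,
                   embeddings: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Feature dict for token i: simple baseline features plus embedding dimensions."""
    word = sentence[i]
    features: Dict[str, float] = {
        "bias": 1.0,
        "word.lower=" + word.lower(): 1.0,
        "word.istitle": float(word.istitle()),
        "word.isdigit": float(word.isdigit()),
    }

    # Context features from the neighbouring tokens, with sentence-boundary markers.
    if i == 0:
        features["BOS"] = 1.0
    else:
        features["prev.lower=" + sentence[i - 1].lower()] = 1.0
    if i == len(sentence) - 1:
        features["EOS"] = 1.0
    else:
        features["next.lower=" + sentence[i + 1].lower()] = 1.0

    # Additional real-valued features: one per embedding dimension.
    # Out-of-vocabulary words get no embedding features at all.
    vector = embeddings.get(word.lower())
    if vector is not None:
        for d, value in enumerate(vector):
            features[f"emb_{d}"] = float(value)
    return features


def sentence_features(sentence: List[str],
                      embeddings: Dict[str, Sequence[float]]) -> List[Dict[str, float]]:
    """One feature dict per token, in sentence order."""
    return [token_features(sentence, i, embeddings) for i in range(len(sentence))]


if __name__ == "__main__":
    for feats in sentence_features(["Vive", "en", "Madrid"], TOY_EMBEDDINGS):
        print(feats)
```

A list of per-token feature dictionaries like this is the input shape that CRF toolkits such as sklearn-crfsuite accept. Feeding raw embedding dimensions as real-valued features is only one option; cluster identifiers derived from the same unlabeled data could be added as categorical features in the same dictionary.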

Spanish NER
Word Representations
Word Representations for Spanish NER
Experiments and Discussion
NER Model
Baseline Features
Results
Discussion
Conclusions
