Bridging the Native Language and Language Variety Identification Tasks

Marc Franco-Salvador, Greg Kondrak, Paolo Rosso

doi:10.1016/j.procs.2017.08.068

Open Access

Journal: Procedia Computer Science
Publication Date: Jan 1, 2017
Citations: 9
License type: CC BY-NC-ND

Affiliation: Universitat Politècnica de València, GfK (Germany), University of Alberta

#Native Language Identification #Task-specific Adaptations

Abstract

The objective of Native Language Identification is to determine the native language of an author from a text that he or she wrote in another language. By contrast, Language Variety Identification aims at classifying texts representing different varieties of a single language. We postulate that both tasks may be reduced to a single objective, which is to identify the language variety of the text. We design a general approach that combines string kernels and word embeddings, which capture different characteristics of texts. Our experiments show that the approach achieves excellent results on both tasks, without any task-specific adaptations.

Full Text