VLSP Shared Task: Named Entity Recognition

Huyen T M Nguyen,Quyen T Ngo,Vu M Tran,Hien T T Nguyen,Luong X Vu

doi:10.15625/1813-9663/34/4/13161

Abstract

Named entities (NE) are phrases that contain the names of persons, organizations, locations, times and quantities, monetary values, percentages, etc. Named Entity Recognition (NER) is the task of recognizing named entities in documents. NER is an important subtask of Information Extraction, which has attracted researchers all over the world since 1990s. For Vietnamese language, although there exists some research projects and publications on NER task before 2016, no systematic comparison of the performance of NER systems has been done. In 2016, the organizing committee of the VLSP workshop decided to launch the first NER shared task, in order to get an objective evaluation of Vietnamese NER systems and to promote the development of high quality systems. As a result, the first dataset with morpho-syntactic and NE annotations has been released for benchmarking NER systems. At VLSP 2018, the NER shared task has been organized for the second time, providing a bigger dataset containing texts from various domains, but without morpho-syntactic annotation. These resources are available for research purpose via the VLSP website vlsp.org.vn/resources. In this paper, we describe the datasets as well as the evaluation results obtained from these two campaigns.

Full Text