A BERT based Chinese Named Entity Recognition method on ASEAN News

Haoyu Zhuang, Fu Wang, Yongzhong Huang, Songlin Bo

doi:10.1088/1742-6596/1848/1/012101

Open Access

Abstract

As the first step in building a knowledge graph that records information about the ASEAN countries, we conduct named-entity recognition (NER) on Chinese news about those countries. We replace the LSTM architecture with a bidirectional gated recurrent unit (GRU) to improve both the effectiveness of the model and its ability to understand polysemous words. BERT, a state-of-the-art word embedding model, is used to generate high-quality word vectors for the NER task. We also propose a similarity-based dataset partition method to help the model learn the polysemy within Chinese news. Experiments demonstrate that the combination of these improvements benefits model performance in identifying different types of named entities.

Full Text