Abstract

Medical named entity recognition (NER) is an important task in clinical natural language processing (NLP) and an active topic in intelligent medicine research. The recently proposed Lattice-LSTM model demonstrated that incorporating word information into character-level Chinese NER achieves new benchmark results on Chinese datasets in several other domains. However, because the lattice structure is dynamic and complex, lattice-based models cannot fully exploit GPU parallel computing, which limits their application. In this work, we propose a Well-Behaved Transformer (WB-Transformer) model for Chinese medical named entity recognition that uses a high-performance encoding strategy to separately encode the characters of Chinese electronic medical records (EMRs) and the words matched to those characters. This reduces the impact of word segmentation errors while retaining word boundary information, and makes full use of both character and word information for Chinese medical NER. Experiments on three Chinese medical entity recognition datasets show that our proposed model outperforms other methods.
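The abstract does not give architectural details, so the following is only a minimal PyTorch-style sketch of the general idea it describes: encoding the character sequence and the lexicon-matched words in two separate streams, then fusing them for per-character tagging. All names, layer sizes, and the cross-attention fusion step are illustrative assumptions, not the authors' actual WB-Transformer.

```python
import torch
import torch.nn as nn


class CharWordEncoder(nn.Module):
    """Hypothetical dual-stream encoder: characters and matched words are
    encoded separately, then word information is injected into each
    character position via cross-attention."""

    def __init__(self, n_chars, n_words, d_model=128, n_heads=4,
                 n_layers=2, n_tags=9):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, d_model)
        self.word_emb = nn.Embedding(n_words, d_model)
        # Separate Transformer encoders for the two streams.
        self.char_enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            n_layers)
        self.word_enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            n_layers)
        # Each character attends to the lexicon words matched in the sentence.
        self.fuse = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Linear(d_model, n_tags)  # per-character tag logits

    def forward(self, char_ids, word_ids):
        # char_ids: (batch, seq_len); word_ids: (batch, n_matched_words)
        c = self.char_enc(self.char_emb(char_ids))
        w = self.word_enc(self.word_emb(word_ids))
        # Inject word-boundary information, then classify with a residual sum.
        fused, _ = self.fuse(query=c, key=w, value=w)
        return self.classifier(c + fused)


model = CharWordEncoder(n_chars=5000, n_words=20000)
chars = torch.randint(0, 5000, (2, 30))   # batch of 2 sentences, 30 chars each
words = torch.randint(0, 20000, (2, 12))  # lexicon words matched per sentence
logits = model(chars, words)              # shape: (2, 30, 9)
```

Because the word stream is a flat sequence rather than a dynamic lattice, both streams run as ordinary batched Transformer passes, which is the kind of GPU-friendly behavior the abstract contrasts with lattice-based models.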
