Abstract
Bidirectional long short-term memory (Bi-LSTM) is an effective network for sequence labeling and is widely used in named entity recognition (NER). However, because a Bi-LSTM processes tokens sequentially and handles only one sentence at a time, it cannot capture global information across sentences. To compensate for this limitation, we propose a hierarchical context model that incorporates sentence-level and document-level feature extraction. For sentence-level feature extraction, we use a self-attention mechanism that weights each word by its contribution to the sentence. For document-level feature extraction, we use a 3D convolutional neural network (CNN), which not only extracts features within sentences but also attends to the sequential relationships between sentences. Furthermore, we investigate a layer-by-layer residual (LBL Residual) structure to optimize each Bi-LSTM block of our model, which addresses the degradation problem that appears as the number of layers increases. Experiments show that our model achieves results competitive with the state of the art on the CoNLL-2003 and OntoNotes 5.0 English datasets.
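As a rough illustration of the sentence-level step, the sketch below pools per-word hidden states into a single sentence vector using attention weights. It is not the model's actual formulation: the dot-product scoring, the `query` vector (standing in for a learned parameter), and the function names are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Pool word vectors into one sentence vector.

    hidden_states: list of per-word vectors (e.g. Bi-LSTM outputs).
    query: hypothetical learned vector that scores each word (assumption).
    Returns (sentence_vector, attention_weights).
    """
    # Score each word by its dot product with the query vector.
    scores = [sum(h_j * q_j for h_j, q_j in zip(h, query)) for h in hidden_states]
    # Normalize the scores so each word's contribution sums to 1.
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # The sentence representation is the weighted sum of word vectors.
    sentence_vec = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return sentence_vec, weights
```

Words whose hidden states align more closely with the query receive larger weights, so they contribute more to the sentence representation. A representation of this kind could then be combined with the word-level features, though how the model does so is not specified here.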