Abstract

Named entity recognition is a fundamental task in natural language processing that aims to identify entities such as persons, places, and organizations in text. Identifying names in ancient Chinese literature helps to explore traditional Chinese culture and promote its spirit. Unfortunately, named entity recognition in the field of ancient Chinese literature faces two main problems: (1) The scarcity of annotated corpora has led to little research in this area. (2) Most existing work focuses only on character embeddings, resulting in limited performance, because character vectors alone struggle to capture the relationship between characters and words when processing Chinese texts, especially ancient Chinese texts. To tackle these problems, we first introduce a distant supervision method to construct the required annotated dataset, and then propose a boundary detection enhanced named entity recognition model based on BERT+CRF. Comparative experiments show that the proposed framework is effective, achieving a best F1 value of 81.24%.
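The abstract does not describe how the distant supervision step is carried out. As a rough illustration only, the sketch below shows one common form of distant supervision for NER: matching text against an entity lexicon and emitting character-level BIO tags. The function name `distant_label`, the toy lexicon, and the greedy longest-match strategy are all assumptions made for this example, not the authors' procedure.

```python
def distant_label(text, lexicon):
    """Tag each character with a BIO label using greedy longest-match lookup.

    `lexicon` maps an entity string to its type (hypothetical format).
    """
    tags = ["O"] * len(text)
    max_len = max((len(name) for name in lexicon), default=0)
    i = 0
    while i < len(text):
        matched = False
        # Try the longest candidate span first so longer names win.
        for length in range(min(max_len, len(text) - i), 0, -1):
            span = text[i:i + length]
            if span in lexicon:
                label = lexicon[span]
                tags[i] = "B-" + label
                for j in range(i + 1, i + length):
                    tags[j] = "I-" + label
                i += length
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Toy lexicon and sentence, invented for illustration.
lexicon = {"孔子": "PER", "鲁": "LOC"}
sentence = "孔子去鲁"
tags = distant_label(sentence, lexicon)
for char, tag in zip(sentence, tags):
    print(char, tag)
```

Labels produced this way are noisy (a lexicon entry can match a string that is not an entity in context, and unseen entities are missed), which is one reason such automatically built datasets are usually paired with a model that can generalize beyond the lexicon.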
