Traditional named-entity recognition methods ignore the correlations between named entities and lose the hierarchical structural information among the named entities in a given text. Although these methods are effective on conventional datasets with simple structures, they are less effective on sports texts. This paper proposes a Chinese sports text named-entity recognition method based on a character graph convolutional neural network (Char GCN) combined with a self-attention mechanism. In this method, each Chinese character in the sports text is regarded as a node. Edges between nodes are constructed from similar character positions and the character features of the named entities in the sports text. The internal structural information of each entity is extracted by the character graph convolutional neural network. The hierarchical semantic information of the sports text is captured by the self-attention model, which strengthens the relationships between named entities and captures the relevance and dependency between characters. A conditional random field (CRF) classification layer then accurately identifies the named entities in the Chinese sports text. Experiments conducted on four datasets demonstrate that the proposed method significantly improves the F-score over traditional named-entity recognition methods, reaching 92.51%, 91.91%, 93.98%, and 95.01%, respectively.
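The pipeline described in the abstract (characters as nodes, a graph convolution over them, then self-attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The edge rule (link characters within a small position window, or identical characters), the random embeddings and weights, and all function names are assumptions made for illustration only, and the CRF decoding stage is omitted.

```python
import numpy as np


def build_char_adjacency(text, window=1):
    """Hypothetical edge rule: connect characters that are at most `window`
    positions apart, or that are the same character. Self-loops are included."""
    n = len(text)
    adj = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            if j - i <= window or text[i] == text[j]:
                adj[i, j] = adj[j, i] = 1.0
    return adj


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}, as in a standard GCN."""
    deg = adj.sum(axis=1)  # always >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_hat, h, w):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_hat @ h @ w, 0.0)


def self_attention(h):
    """Scaled dot-product self-attention without learned projections
    (a simplification). Returns the attended features and the weights."""
    scores = h @ h.T / np.sqrt(h.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ h, weights


def encode(text, embed_dim=8, hidden_dim=4, seed=0):
    """Run the sketch end to end with random placeholder embeddings and weights."""
    rng = np.random.default_rng(seed)
    h0 = rng.normal(size=(len(text), embed_dim))  # stand-in character embeddings
    w = rng.normal(size=(embed_dim, hidden_dim))  # stand-in GCN weight matrix
    a_hat = normalize_adjacency(build_char_adjacency(text))
    h1 = gcn_layer(a_hat, h0, w)
    return self_attention(h1)


features, attn = encode("姚明加盟火箭队")
print(features.shape, attn.shape)
```

In a full model, the per-character features returned by `encode` would be passed to a CRF layer that assigns a tag to each character. The embeddings and weights would be learned rather than sampled at random.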