Abstract

Neural network models are widely used for the Chinese word segmentation task. The recently proposed capsule architecture addresses some shortcomings of convolutional neural networks. In this paper, we introduce the capsule architecture to Chinese word segmentation for the first time, using capsules as the neural units. Before applying the routing algorithm, we use a sliding capsule window to select the features extracted by the primary capsule layer. The sliding capsule window adapts the capsule architecture to the sequence labeling task. Experimental results show that our capsule-based Chinese word segmentation model achieves performance competitive with previous state-of-the-art methods. Ancient Chinese medical books record a wealth of valuable experience from ancient medical practitioners. However, research on automatic text analysis of ancient Chinese medical documents is only beginning. Because annotated data for Chinese medicine are lacking, we develop a word segmentation guideline for ancient Chinese medical documents and select 30 ancient Chinese medical books covering 10 genres to build an annotated dataset. Using the annotated data, we develop a segmenter for ancient Chinese medical text. Experiments show that our model achieves $F_{1}$ scores of 94.9% on Chinese Treebank 6.0 and 81.4% on the Ancient Chinese Medical Books dataset.
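For readers unfamiliar with how an $F_{1}$ score is usually computed for word segmentation, the following is a minimal sketch of the standard word-level definition: a predicted word counts as correct only if both of its character boundaries match a gold word. The abstract does not describe the evaluation script, so the function names `word_spans` and `segmentation_f1` and the toy sentence are hypothetical, chosen only for illustration.

```python
from typing import List, Set, Tuple


def word_spans(words: List[str]) -> Set[Tuple[int, int]]:
    """Convert one segmented sentence into a set of (start, end) character spans."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def segmentation_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Word-level F1 over a corpus (hypothetical helper, not the authors' script).

    Each sentence is a list of words; the gold and predicted segmentations of a
    sentence are assumed to cover the same underlying character string.
    """
    correct = n_gold = n_pred = 0
    for gold_words, pred_words in zip(gold, pred):
        gold_set = word_spans(gold_words)
        pred_set = word_spans(pred_words)
        correct += len(gold_set & pred_set)  # words with both boundaries right
        n_gold += len(gold_set)
        n_pred += len(pred_set)
    if correct == 0:
        return 0.0
    precision = correct / n_pred
    recall = correct / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: the prediction wrongly splits the second gold word in two.
gold_corpus = [["中医", "古籍", "分词"]]
pred_corpus = [["中医", "古", "籍", "分词"]]
score = segmentation_f1(gold_corpus, pred_corpus)
```

In the toy example, two of the four predicted words match the three gold words, so precision is 2/4 and recall is 2/3. Counts are pooled over the whole corpus before precision and recall are computed, which is the usual convention for this metric.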
