Abstract
Most current word segmentation methods are rule-based or rely on traditional machine learning. General-purpose word segmentation tools perform poorly in specialized fields such as metallurgy, and domain-specific Chinese word segmentation has rarely been studied. In recent years, with the development of deep learning, neural networks have proved effective for Chinese word segmentation. However, this performance depends on large-scale training data: neural networks with conventional architectures cannot achieve the desired results on low-resource datasets because labeled training data are scarce. Taking metallurgy as an example, this paper proposes a domain-specific Chinese word segmentation method based on a bidirectional long short-term memory (Bi-LSTM) model. First, the word segmentation model is obtained by training the Bi-LSTM model on knowledge from both inside and outside the domain. Then, the parameters are tuned, and the label probability of each word is combined with a weight. Finally, the segmentation result is produced by a label inference layer. Experimental results show that the proposed method achieves better word segmentation in the field of metallurgy.
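The last two steps (weighting the label probabilities and inferring the final labels) can be illustrated with a minimal sketch. The abstract does not give the tag scheme, the weighting formula, or the inference algorithm, so everything below is an assumption made for illustration: a BMES tag set, linear interpolation of two per-character tag distributions with a hypothetical `weight` parameter, and Viterbi decoding restricted to legal tag transitions. The probabilities in the example are invented stand-ins for Bi-LSTM outputs, not results from the paper.

```python
import math

# Assumed tag scheme (not stated in the abstract):
# B = begin of word, M = middle, E = end, S = single-character word.
TAGS = ("B", "M", "E", "S")

# Legal transitions under BMES: every word is either S or B (M)* E.
ALLOWED = {
    "B": ("M", "E"),
    "M": ("M", "E"),
    "E": ("B", "S"),
    "S": ("B", "S"),
}
START_TAGS = ("B", "S")  # a sentence must start a new word
END_TAGS = ("E", "S")    # a sentence must finish a word


def combine(p_in, p_out, weight=0.7):
    """Linearly interpolate in-domain and out-of-domain tag distributions.

    `weight` is a hypothetical hyperparameter favouring the in-domain model.
    """
    return [
        {t: weight * a[t] + (1.0 - weight) * b[t] for t in TAGS}
        for a, b in zip(p_in, p_out)
    ]


def _log(p):
    return math.log(p) if p > 0 else float("-inf")


def viterbi(probs):
    """Return the most probable tag sequence that obeys the BMES constraints."""
    if not probs:
        return []
    neg_inf = float("-inf")
    score = [{t: _log(probs[0][t]) if t in START_TAGS else neg_inf for t in TAGS}]
    back = []
    for dist in probs[1:]:
        prev, cur, ptr = score[-1], {}, {}
        for t in TAGS:
            # Only predecessors that may legally be followed by tag t.
            best = max((p for p in TAGS if t in ALLOWED[p]), key=lambda p: prev[p])
            cur[t] = prev[best] + _log(dist[t])
            ptr[t] = best
        score.append(cur)
        back.append(ptr)
    tags = [max(END_TAGS, key=lambda t: score[-1][t])]
    for ptr in reversed(back):
        tags.append(ptr[tags[-1]])
    return tags[::-1]


def segment(text, tags):
    """Cut the text after every E or S tag."""
    words, start = [], 0
    for i, t in enumerate(tags):
        if t in ("E", "S"):
            words.append(text[start:i + 1])
            start = i + 1
    return words


# Invented per-character distributions for the four characters of "高炉炼铁".
p_in = [
    {"B": 0.80, "M": 0.05, "E": 0.05, "S": 0.10},
    {"B": 0.10, "M": 0.10, "E": 0.70, "S": 0.10},
    {"B": 0.70, "M": 0.10, "E": 0.10, "S": 0.10},
    {"B": 0.05, "M": 0.05, "E": 0.80, "S": 0.10},
]
p_out = [
    {"B": 0.40, "M": 0.10, "E": 0.10, "S": 0.40},
    {"B": 0.10, "M": 0.10, "E": 0.40, "S": 0.40},
    {"B": 0.40, "M": 0.10, "E": 0.10, "S": 0.40},
    {"B": 0.10, "M": 0.10, "E": 0.40, "S": 0.40},
]

example_words = segment("高炉炼铁", viterbi(combine(p_in, p_out)))
print(example_words)
```

Constraining the decoder to legal transitions means the output is always a well-formed segmentation, even when the per-character argmax would produce an impossible sequence such as M followed by B.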