Abstract
In recent years, interest in health question and answer (Q&A) websites has grown with the development of internet technologies. Finding appropriate professional medical information among massive amounts of data has become a major concern for patients. It is therefore vital to obtain reasonable predictions and automatic recommendations based on patients’ keyword descriptions of their health status and question intention. The key to solving this problem is automatic text classification of health questions. This paper considers a feature fusion model that combines text features and topic features to classify Chinese short texts from medical health Q&A websites. First, we generated text word vectors with a word embedding method and obtained the text features with a Long Short-Term Memory (LSTM) model. Because the number of topics is difficult to determine, we conducted a sub-sample experiment to find the few topic numbers under which classification performance was good. We then extracted the topic features and filtered them using the one-dimensional convolution idea of the Convolutional Neural Network (CNN) model. Finally, we combined the two kinds of features for text classification. Two experiments on datasets from different online medical Q&A websites were conducted to evaluate our model in terms of recall, precision, and F1 value. The results showed that the LSTM&Topic-CNN model could effectively improve the classification of Chinese medical health question texts.
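The fusion step described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the ReLU activation, the global max pooling, the kernel values, and the toy inputs are all assumptions made for illustration. The LSTM output is stood in for by a fixed vector, and the topic features by a small topic distribution.

```python
from typing import List, Sequence


def conv1d_valid(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide `kernel` over `signal` without padding (cross-correlation, as in CNN layers)."""
    width = len(kernel)
    if width == 0 or width > len(signal):
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[start + offset] * kernel[offset] for offset in range(width))
        for start in range(len(signal) - width + 1)
    ]


def filter_topic_features(
    topic_dist: Sequence[float], kernels: Sequence[Sequence[float]]
) -> List[float]:
    """Return one value per kernel: ReLU, then global max pooling (both assumed here)."""
    return [
        max(max(value, 0.0) for value in conv1d_valid(topic_dist, kernel))
        for kernel in kernels
    ]


def fuse_features(
    text_features: Sequence[float],
    topic_dist: Sequence[float],
    kernels: Sequence[Sequence[float]],
) -> List[float]:
    """Concatenate the text features with the filtered topic features."""
    return list(text_features) + filter_topic_features(topic_dist, kernels)


# Hypothetical inputs: a 3-dimensional stand-in for the LSTM text features
# and a 4-topic distribution for the same question.
text_features = [0.5, -0.25, 1.0]
topic_dist = [0.125, 0.5, 0.25, 0.125]
kernels = [[1.0, 1.0], [1.0, -1.0]]  # illustrative fixed kernels, not learned

fused = fuse_features(text_features, topic_dist, kernels)
print(fused)  # [0.5, -0.25, 1.0, 0.75, 0.25]
```

In a trained model the kernels would be learned and the fused vector would feed a classification layer. Plain concatenation is used here only because it is the simplest way to combine two feature vectors.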
Highlights
With the rapid growth of the intelligent medical field, many online medical question and answer (Q&A) websites have sprung up in recent years [1].
Mao et al.: Long Short-Term Memory (LSTM)&Topic-Convolutional Neural Network (CNN) Model for Classification of Online Chinese Medical Questions
[...] analysis for text classification, but few have considered it in terms of feature fusion.
We evaluated the performance of the LSTM&Topic-CNN model on two real datasets from different online medical Q&A websites.
Summary
With the rapid growth of the intelligent medical field, many online medical question and answer (Q&A) websites have sprung up in recent years [1]. [...] analysis for text classification, but few have considered it in terms of feature fusion. We establish an LSTM&Topic-CNN model based on the fusion of text semantic features and topic features and apply it to the classification of Chinese medical text.