Abstract

Medical short text classification is important for medical information extraction and auxiliary medical diagnosis. However, medical short texts have sparse features, ambiguous semantics, and highly specialized vocabulary, all of which lower classification accuracy. Taking these characteristics into account, this paper proposes a Chinese medical short text classification model based on DPECNN. First, ERNIE is used to learn textual knowledge and strengthen the model's semantic representation. The DPECNN model then extracts rich feature information, and a fully connected layer produces the classification result. The standard DPCNN captures only deep contextual semantic information and overlooks the correlation between adjacent channels. To address this, DPECNN introduces ECA channel attention, which accounts for semantic information in adjacent channels. A self-normalizing activation function helps avoid the vanishing gradient problem. To improve robustness and generalization, the FGM adversarial training algorithm is used to perturb the data during training. The model achieves F1 scores of 95.00%, 79.45%, and 82.81% on the THUCNews, KUAKE-QIC, and CHIP-CTC datasets, respectively.
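To make two of the components named above more concrete, the following is a minimal sketch in plain Python of ECA-style channel attention and of the FGM perturbation, written from the general published descriptions of those techniques rather than from this paper's implementation. The function names `eca_attention` and `fgm_perturbation`, the fixed `kernel` weights (learned in a real model), and the default `epsilon` are illustrative assumptions; where each operation is applied inside DPECNN is not specified in the abstract.

```python
import math


def eca_attention(features, kernel):
    """ECA-style channel attention (illustrative sketch, not the paper's code).

    features: list of C channels, each a list of L floats.
    kernel:   odd-length list of 1D convolution weights; hypothetical fixed
              values here, learned parameters in a real model.
    """
    k = len(kernel)
    if k % 2 != 1:
        raise ValueError("kernel length must be odd")
    pad = k // 2

    # 1. Global average pooling: one descriptor per channel.
    pooled = [sum(channel) / len(channel) for channel in features]

    # 2. 1D convolution across the channel axis (zero padded), so each
    #    channel's gate depends on its k nearest neighbouring channels.
    padded = [0.0] * pad + pooled + [0.0] * pad
    gates = []
    for c in range(len(pooled)):
        score = sum(kernel[j] * padded[c + j] for j in range(k))
        # 3. Sigmoid turns the score into a gate in (0, 1).
        gates.append(1.0 / (1.0 + math.exp(-score)))

    # 4. Rescale every channel by its gate.
    return [[g * x for x in channel] for g, channel in zip(gates, features)]


def fgm_perturbation(grad, epsilon=1.0):
    """FGM perturbation r = epsilon * g / ||g||_2 for a flat gradient vector.

    grad is assumed to be the loss gradient with respect to the perturbed
    representation (commonly the word embeddings); epsilon is a
    hypothetical default.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)  # no direction to perturb along
    return [epsilon * g / norm for g in grad]


# Usage: three channels of length two, neighbourhood size k = 3.
scaled = eca_attention([[1.0, 3.0], [1.0, 1.0], [2.0, 2.0]], [1.0, 1.0, 1.0])
r_adv = fgm_perturbation([3.0, 4.0], epsilon=0.5)
```

In adversarial training of this kind, `r_adv` would typically be added to the representation, the loss recomputed on the perturbed input, and the gradients from the clean and perturbed passes accumulated before the optimizer step.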
