Abstract

Background

Text Matching (TM) is a fundamental task of natural language processing widely used in many application systems such as information retrieval, automatic question answering, machine translation, dialogue systems and reading comprehension. In recent years, a large number of deep learning neural networks have been applied to TM and have repeatedly refreshed TM benchmarks. Among these networks, the convolutional neural network (CNN) is one of the most popular, but it has difficulty dealing with small samples and preserving the relative structure of features. In this paper, we propose a novel deep learning architecture based on the capsule network for TM, called CapsTM. The capsule network is a new type of neural network architecture proposed to address some of the shortcomings of CNN, and it shows great potential in many tasks.

Methods

CapsTM is a five-layer neural network consisting of an input layer, a representation layer, an aggregation layer, a capsule layer and a prediction layer. In CapsTM, two pieces of text are first individually converted into sequences of embeddings and further transformed by a highway network in the input layer. Then, a Bidirectional Long Short-Term Memory (BiLSTM) network represents each piece of text, and an attention-based interaction matrix represents the interactive information of the two pieces of text in the representation layer. Subsequently, the two kinds of representations are fused by a BiLSTM in the aggregation layer and further represented with capsules (vectors) in the capsule layer. Finally, the prediction layer is a fully connected network used for classification. CapsTM is an extension of ESIM obtained by adding a capsule layer before the prediction layer.

Results

We construct a corpus of Chinese medical question matching containing 36,360 question pairs. This corpus is randomly split into three parts: a training set of 32,360 question pairs, a development set of 2000 question pairs and a test set of 2000 question pairs. On this corpus, we conduct a series of experiments to evaluate the proposed CapsTM and compare it with other state-of-the-art methods. CapsTM achieves the highest F-score of 0.8666.

Conclusion

The experimental results demonstrate that CapsTM is effective for Chinese medical question matching and outperforms the other state-of-the-art methods compared.
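Two of the operations named in the Methods section, the attention-based interaction between the two token sequences and the capsule (vector) representation, can be sketched in NumPy. This is a minimal illustration under assumed shapes, not the authors' implementation: `interaction_matrix` and `squash` are illustrative names, and the squash non-linearity is the standard capsule-network formulation rather than anything specified in this abstract.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interaction_matrix(a, b):
    """Attention-based interaction of two encoded texts.

    a: (len_a, d) token representations of text A
    b: (len_b, d) token representations of text B
    Each token of one text attends over all tokens of the other,
    yielding a cross-aware view of each sequence.
    """
    scores = a @ b.T                            # (len_a, len_b) similarity scores
    a_attended = softmax(scores, axis=1) @ b    # B-aware view of A, shape (len_a, d)
    b_attended = softmax(scores.T, axis=1) @ a  # A-aware view of B, shape (len_b, d)
    return a_attended, b_attended

def squash(v, eps=1e-8):
    """Capsule non-linearity: preserves direction, maps the norm into [0, 1)."""
    norm2 = np.sum(v ** 2, axis=-1, keepdims=True)
    return (norm2 / (1.0 + norm2)) * v / np.sqrt(norm2 + eps)
```

The squashed vectors' lengths stay below 1, which is what lets a capsule's norm be read as an activation probability while its direction encodes the feature.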

Highlights

  • Text Matching (TM) is a fundamental task of natural language processing widely used in many application systems such as information retrieval, automatic question answering, machine translation, dialogue systems and reading comprehension

  • The experimental results demonstrate that Capsule Network for Chinese Medical Text Matching (CapsTM) is effective for Chinese medical question matching and outperforms the other state-of-the-art methods compared

  • We propose a novel deep learning architecture based on the capsule network for TM, called CapsTM, where the capsule network [8] is a new type of neural network architecture proposed to address some of the shortcomings of the convolutional neural network (CNN)



Introduction

Text Matching (TM) is a fundamental task of natural language processing widely used in many application systems such as information retrieval, automatic question answering, machine translation, dialogue systems and reading comprehension (Yu et al., BMC Med Inform Decis Mak 2021, 21(Suppl 2)). It is usually recognized as a classification problem where the input is a pair of pieces of text and the output is a label indicating whether the two pieces of text match (denoted by 1) or not (denoted by 0). A large number of deep learning neural networks, such as the Enhanced Sequential Inference Model (ESIM) [1], Attention-based Convolutional Neural Network (ABCNN) [2], Bilateral Multi-Perspective Matching (BIMPM) [3], Directional Self-Attention Network (DISAN) [4], Densely-connected co-attentive Recurrent Neural Network (DRCN) [5], Decomposable Attention Model (DECOMP) [6] and Bidirectional Encoder Representations from Transformers (BERT) [7], have been proposed for TM and have achieved state-of-the-art performance on many benchmark datasets. Experiments conducted on a manually annotated corpus of Chinese question matching show that CapsTM outperforms seven state-of-the-art neural networks: ESIM [1], ABCNN [2], BIMPM [3], DISAN [4], DRCN [5], DECOMP [6] and BERT [7].
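The classification framing above (a pair of texts in, a 1/0 match label out) can be made concrete with a toy bag-of-words baseline. This is a deliberately simple sketch of the task interface, not CapsTM or any of the cited models; the function names, the cosine-similarity scoring and the 0.5 threshold are all illustrative choices.

```python
import numpy as np
from collections import Counter

def bow_vector(text, vocab):
    """Count-based bag-of-words vector of `text` over a fixed vocabulary."""
    counts = Counter(text.split())
    return np.array([counts[w] for w in vocab], dtype=float)

def match(text_a, text_b, threshold=0.5):
    """Toy matcher: cosine similarity of bag-of-words vectors,
    thresholded to the 1 (match) / 0 (no match) label used in TM."""
    vocab = sorted(set(text_a.split()) | set(text_b.split()))
    va = bow_vector(text_a, vocab)
    vb = bow_vector(text_b, vocab)
    sim = va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb))
    return 1 if sim >= threshold else 0
```

Neural TM models replace the hand-built vectors and fixed threshold with learned representations and a trained classifier, but the input/output contract is the same.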

