Abstract

Text matching is a fundamental NLP task that underpins applications such as text retrieval and question answering. Pre-trained language models have driven progress on text matching. However, because of the particular characteristics of Chinese characters and expressions, Chinese text matching still suffers from difficult word segmentation, serious semantic loss, and model instability. In this paper, we propose the DAINet model, which consists of three modules: DMM, AA, and IO. The Dynamic Multi-Mask (DMM) module improves the completeness of word segmentation. The Augmented Adversarial (AA) module then extracts further semantic information. Finally, the Integrated Output (IO) module produces a more stable output. We conducted experiments on the LCQMC, BQ, and Xiaobu datasets and compared the results with seven strong baseline models. The results show that DAINet substantially improves both the ACC and AUC values on all three datasets. Further ablation experiments show that the proposed DMM, AA, and IO modules adapt well to existing models and improve on them.
