Semantic textual similarity analysis is a fundamental task in natural language processing, and both representational and interactive models are widely employed for it. However, representational models often suffer from semantic defocus, semantic shift, and low accuracy, while interactive models, although able to accurately capture the semantic focus of a text and extract rich semantic information, are hindered by high time costs and low retrieval efficiency. To address these challenges, this paper introduces a Chinese Text Similarity Analysis model based on Residual Fusion (CTSARF). The model fuses the semantic features of representational and interactive models through the identity-mapping mechanism of residual networks, allowing CTSARF to achieve high accuracy and high retrieval efficiency simultaneously. In this study, we employ the cosine-based loss function CoSENT and the fine-tuned pre-trained model Mengzi to optimize performance on text similarity tasks. Additionally, we apply a data augmentation method based on text editing to improve the robustness of CTSARF when evaluating text similarity on unbalanced datasets. We conducted optimization comparison experiments, benchmark model comparison experiments, and ablation experiments on four Chinese datasets. The results demonstrate improvements in accuracy, recall, precision, and F1 score, highlighting the necessity and superiority of the residual-network-based model fusion method.
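The CoSENT loss mentioned above is a ranking-style objective over scaled cosine similarities: for every (dissimilar, similar) pair combination it penalizes cases where the dissimilar pair scores higher, via log(1 + Σ exp(λ·(cos_neg − cos_pos))). The sketch below is an illustrative NumPy implementation under that standard formulation; the function name `cosent_loss`, the scale λ = 20, and the binary-label input format are assumptions, not details taken from the paper.

```python
import numpy as np

def cosent_loss(emb_a, emb_b, labels, scale=20.0):
    """Illustrative CoSENT loss: log(1 + sum over (neg, pos) pairs of
    exp(scale * (cos_neg - cos_pos))). labels: 1 = similar, 0 = dissimilar."""
    # L2-normalize, then cosine similarity of each sentence pair, scaled by lambda
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    cos = np.sum(a * b, axis=1) * scale
    pos = cos[labels == 1]                # similarities that should be high
    neg = cos[labels == 0]                # similarities that should be low
    diffs = (neg[:, None] - pos[None, :]).ravel()  # every neg-minus-pos margin
    # Stable log(1 + sum(exp(diffs))): log-sum-exp over diffs plus a zero term
    terms = np.concatenate(([0.0], diffs))
    m = terms.max()
    return m + np.log(np.sum(np.exp(terms - m)))
```

When the similar pair already scores well above the dissimilar one, the loss is near zero; when the ranking is inverted, the loss grows roughly linearly in the scaled margin, which is what drives the embeddings apart during training.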