Abstract
As a fundamental problem in natural language processing (NLP), computing semantic text similarity plays a crucial role in a variety of big data application scenarios. In text similarity modeling, however, the complexity and ambiguity of Chinese semantics mean that a single angle cannot effectively capture the semantic interaction characteristics of Chinese text. This study proposes a deep learning-based computational model for semantic text similarity called the SRU-based multi-angle enhanced network (SMAEN). Specifically, the authors first combine character-granularity and word-granularity embeddings obtained from a pre-trained model to represent the text. The text is then encoded using a bidirectional simple recurrent unit (Bi-SRU) network, and local text similarity is modeled using a soft-alignment attention mechanism. In addition, the authors integrate Bi-SRU with an improved convolutional neural network (CNN) for global similarity modeling, to capture the semantic, temporal, and spatial characteristics of short-text interaction. Finally, they employ a pooling layer to aggregate the results into a fixed-length vector and a multi-layer perceptron (MLP) classifier to make the final decision. Experimental results on the Chinese public datasets LCQMC and PAWS-X show that the proposed method captures semantic interaction features from multiple angles and achieves strong performance. The method can produce better matching results and improve the accuracy of big data analysis. It is applicable to numerous big data scenarios, such as information retrieval and recommendation systems.
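To make the soft-alignment step more concrete, the following is a minimal sketch in plain Python. It assumes the common dot-product formulation of soft alignment, in which each token of one sentence is re-expressed as a softmax-weighted sum of the other sentence's token vectors. The abstract does not specify the scoring function, so the dot product, the function names (`soft_align`, `weighted_sum`), and the toy two-dimensional vectors are all illustrative assumptions and not the authors' implementation. In SMAEN the inputs would be Bi-SRU encoder outputs rather than hand-written vectors.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    # Dot product of two equal-length vectors.
    return sum(x * y for x, y in zip(u, v))

def weighted_sum(weights, vectors):
    # Convex combination of `vectors` using `weights`.
    dim = len(vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, vectors))
            for k in range(dim)]

def soft_align(a, b):
    # a, b: lists of token vectors for the two sentences.
    # Assumption: alignment scores are plain dot products.
    scores = [[dot(ai, bj) for bj in b] for ai in a]
    # Each token of a attends over all tokens of b.
    a_aligned = [weighted_sum(softmax(row), b) for row in scores]
    # Each token of b attends over all tokens of a (column-wise scores).
    cols = [[scores[i][j] for i in range(len(a))] for j in range(len(b))]
    b_aligned = [weighted_sum(softmax(col), a) for col in cols]
    return a_aligned, b_aligned

# Toy stand-ins for encoder outputs (hypothetical values).
a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [[1.0, 0.0], [0.0, 1.0]]
a_aligned, b_aligned = soft_align(a, b)
```

Each aligned vector has the same dimension as the encoder output, so it can be compared with the original token vector (for example by difference or element-wise product) before pooling. How SMAEN combines these vectors is not stated in the abstract.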
More From: International Journal of Information Technologies and Systems Approach