Abstract

Natural language processing (NLP) has achieved excellent performance in many fields, including semantic understanding, automatic summarization, image recognition, and so on. However, most neural network models for NLP extract text features in a fine-grained way, which is not conducive to grasping the meaning of the text from a global perspective. To alleviate this problem, this paper proposes a combination of traditional statistical methods and deep learning models, together with a novel model based on multi-model nonlinear fusion. The model uses a part-of-speech-based Jaccard coefficient, Term Frequency-Inverse Document Frequency (TF-IDF), and a word2vec-CNN algorithm to measure sentence similarity separately. A normalized weight coefficient is obtained from the calculation accuracy of each model, and the calculation results are compared. The weighted vector is then fed into a fully connected neural network to produce the final classification result. Because the statistical sentence similarity algorithms reduce the granularity of feature extraction, the model can grasp sentence features globally. Experimental results show that the matching accuracy of the sentence similarity calculation method based on multi-model nonlinear fusion is 84%, and the F1 value of the model is 75%.
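The fusion step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the accuracy values below are assumed placeholders, and the function names are invented for the example. Each model's validation accuracy is normalized into a weight, and the weights scale the per-model similarity scores before they are passed to a classifier.

```python
def normalized_weights(accuracies):
    """Normalize per-model accuracies so the weights sum to 1."""
    total = sum(accuracies)
    return [a / total for a in accuracies]

def weighted_similarity_vector(similarities, weights):
    """Scale each model's similarity score by its weight."""
    return [s * w for s, w in zip(similarities, weights)]

# Assumed accuracies for the Jaccard, TF-IDF and word2vec-CNN models.
weights = normalized_weights([0.78, 0.80, 0.82])
# Per-model similarity scores for one sentence pair (illustrative values).
vec = weighted_similarity_vector([0.6, 0.7, 0.9], weights)
# `vec` would then be fed to the fully connected classifier.
```

A more accurate model thus contributes proportionally more to the fused vector, which is the "weighting mechanism" the paper contrasts with extracting a single feature matrix directly.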

Highlights

  • Feature extraction techniques have been widely used in many fields, most of them based on deep learning, including image processing, natural language processing (NLP), and so on

  • In this paper, a multi-model nonlinear fusion algorithm is proposed for different sentence structure features

  • The improved Jaccard algorithm takes grammatical information into account in the similarity calculation, so that the single feature based on the number of co-occurring words is supplemented
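One way to fold grammatical information into a Jaccard score, as the improved algorithm does, is to compare (token, part-of-speech) pairs rather than bare tokens. The sketch below is an assumption about the general idea, not the paper's exact formulation; the tag set and example sentences are illustrative.

```python
def pos_jaccard(tagged_a, tagged_b):
    """Jaccard similarity over sets of (token, POS) pairs."""
    set_a, set_b = set(tagged_a), set(tagged_b)
    if not set_a and not set_b:
        return 1.0  # two empty sentences are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)

a = [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")]
b = [("the", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")]
score = pos_jaccard(a, b)  # 2 shared pairs out of 4 in the union -> 0.5
```

Because a word only matches when both its surface form and its part of speech agree, co-occurrence counts alone no longer dominate the score.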


Summary

INTRODUCTION

It is undeniable that feature extraction techniques have been widely used in many fields, most of them based on deep learning, including image processing, NLP, and so on. Different from image processing, the basic semantic unit of NLP [1]–[8] is the sememe, which is independent, decentralized, and diversified. These features determine that a model needs to grasp the meaning of the text at a coarse granularity. Pinheiro et al. [11] propose a sentence similarity calculation model based on the fusion of a deep learning model and statistical methods. The model combines the traditional statistics-based sentence similarity calculation method and completes coarse-grained extraction of the sentence. This calculation method realizes an overall grasp of text features at a coarse granularity. Compared with direct extraction of the sentence feature matrix, this weighting mechanism can highlight the key points of extraction.
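The statistical component that such fused models combine with deep learning can be illustrated with a stdlib-only TF-IDF similarity. This is a hedged sketch, not the cited model: a real system would compute IDF over a large corpus, whereas the two-sentence "corpus" and the smoothed IDF formula here are assumptions for the example.

```python
import math
from collections import Counter

def tfidf_vectors(sent_a, sent_b):
    """Vectorize two whitespace-tokenized sentences over their shared vocabulary."""
    docs = [sent_a.split(), sent_b.split()]
    vocab = sorted(set(docs[0]) | set(docs[1]))
    # Document frequency of each word across the two-sentence "corpus".
    df = {w: sum(w in d for d in docs) for w in vocab}
    vecs = []
    for d in docs:
        tf = Counter(d)
        # Smoothed IDF so words shared by both sentences keep nonzero weight.
        vecs.append([tf[w] / len(d) * (math.log((1 + len(docs)) / (1 + df[w])) + 1.0)
                     for w in vocab])
    return vecs

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

A score of this kind, computed over whole sentences rather than learned token embeddings, is what gives the fused model its coarse-grained, global view of the text.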

RELATED WORKS
THREE SENTENCE SIMILARITY COMPUTING MODELS
EXPERIMENT AND RESULT ANALYSIS
Findings
CONCLUSION AND FUTURE WORK
