Abstract

Sentence similarity calculation is one of the foundations of natural language processing. Existing sentence similarity measures are based either on shallow semantics, which capture latent semantic information inadequately, or on deep learning algorithms, which require supervision. In this paper, we improve the traditional tolerance rough set model; the improved model has lower time complexity than the traditional one and is incremental. We then propose a sentence similarity computation model, based on the probabilistic tolerance rough set model, that approaches the problem from the perspective of the uncertainty of text data. The model can mine latent semantic information and is unsupervised. Experiments on the SICK2014 task and the STSbenchmark dataset show that our model computes sentence similarity effectively and efficiently.

Highlights

  • With the rapid development of information technology, the volume of text data is growing continuously

  • Sentence similarity aims at calculating the degree of resemblance or distance between two sentences. It plays an important role in natural language processing (NLP) applications such as text summarization [1, 2], machine translation [3], question answering systems [4], and information retrieval [5]. These applications rely on sentence similarity to a certain extent, and their development makes research on sentence similarity urgent

  • Methods based on deep learning algorithms such as convolutional neural networks (CNNs) can capture deep semantic information, but most of them have high time complexity and require supervision


Summary

Introduction

With the rapid development of information technology, the volume of text data is growing continuously. Sentence similarity aims at calculating the degree of resemblance or distance between two sentences. It plays an important role in natural language processing (NLP) applications such as text summarization [1, 2], machine translation [3], question answering systems [4], and information retrieval [5]. Methods based on deep learning algorithms such as convolutional neural networks (CNNs) can capture deep semantic information, but most of them have high time complexity and require supervision. Neither the shallow-semantics methods nor the deep learning methods handle the uncertainty and imprecision of text sentences well. Our model can process the uncertainty and imprecision of text data and overcomes the shortcomings mentioned above.
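
As background for readers unfamiliar with the traditional tolerance rough set model named in the abstract, the following is a minimal sketch of its classical form, not the probabilistic model proposed in this paper. It assumes that a term's tolerance class is the term itself plus every term co-occurring with it in at least theta sentences, and that a sentence is enriched by its upper approximation. The function names, the toy corpus, the threshold value, and the use of Jaccard overlap as the similarity score are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def tolerance_classes(sentences, theta):
    """Map each term to its tolerance class: the term itself plus every
    term that co-occurs with it in at least `theta` sentences."""
    cooc = defaultdict(int)
    vocab = set()
    for sent in sentences:
        terms = set(sent)
        vocab |= terms
        # Count each unordered term pair once per sentence.
        for a, b in combinations(sorted(terms), 2):
            cooc[(a, b)] += 1
    classes = {t: {t} for t in vocab}
    for (a, b), count in cooc.items():
        if count >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(sentence, classes):
    """Return every term whose tolerance class overlaps the sentence."""
    terms = set(sentence)
    return {t for t, cls in classes.items() if cls & terms}


def jaccard(a, b):
    """Jaccard overlap of two term sets (illustrative similarity score)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def trsm_similarity(s1, s2, classes):
    """Compare two sentences through their upper approximations."""
    return jaccard(upper_approximation(s1, classes),
                   upper_approximation(s2, classes))


# Hypothetical toy corpus of tokenized sentences.
corpus = [
    ["a", "man", "plays", "guitar"],
    ["a", "man", "plays", "music"],
    ["guitar", "music", "concert"],
    ["guitar", "music", "band"],
]
classes = tolerance_classes(corpus, theta=2)
plain = jaccard(set(corpus[0]), set(corpus[1]))            # surface overlap
enriched = trsm_similarity(corpus[0], corpus[1], classes)  # with tolerance classes
print(plain, enriched)
```

In this toy corpus "guitar" and "music" co-occur twice, so each joins the other's tolerance class. The upper approximations of the first two sentences then coincide and the enriched score exceeds the plain word-overlap score. This illustrates, in a simplified way, how a tolerance relation can expose latent term relations without supervision.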

