Optimized Algorithm Design for Text Similarity Detection Based on Artificial Intelligence and Natural Language Processing

Zhe Liu, Jiajia Zhu, Xinyun Cheng, Qingqiang Lu

doi:10.1016/j.procs.2023.11.023

Abstract

This paper presents a text similarity detection method based on artificial intelligence and natural language processing. The method combines statistical machine learning with deep learning and designs six models from three perspectives: character level, word level, and semantic level. These are the diff, cosine similarity, Jaccard, and TF-IDF models based on machine learning, and the SimCSE and SBERT models based on deep learning. To make full use of each model's characteristics, three scenarios for calculating the similarity score are designed on the basis of experience and the results of multiple experiments. The results show that calculating similarity scores under these three scenarios achieves high accuracy while requiring fewer computational resources. As deep learning and natural language processing continue to advance, text similarity detection methods built on them will continue to improve and play a more significant role in practice. Future research can explore additional models and algorithms to enhance the accuracy and robustness of plagiarism detection.
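
As a rough illustration of what the character-level and word-level models named in the abstract measure, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function names, the whitespace tokenization, the lowercasing, and the use of `difflib.SequenceMatcher` as a stand-in for the diff model are all assumptions made for illustration. The TF-IDF, SimCSE, and SBERT models are omitted because they need a corpus or pretrained weights.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def diff_similarity(a: str, b: str) -> float:
    """Character-level similarity from difflib's matching-block ratio.

    Assumption: difflib stands in for the paper's diff model.
    """
    return SequenceMatcher(None, a, b).ratio()


def jaccard_similarity(a: str, b: str) -> float:
    """Word-level Jaccard index: |A & B| / |A | B| over token sets."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(a: str, b: str) -> float:
    """Word-level cosine similarity over raw term-count vectors."""
    vec_a, vec_b = Counter(a.lower().split()), Counter(b.lower().split())
    # Only terms present in both texts contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)
```

For the pair "the cat sat" and "the cat ran", the two texts share two of four distinct words, so the Jaccard score is 0.5 and the count-vector cosine is 2/3. Both word-level scores ignore word order, which is one reason a character-level or semantic-level model can be a useful complement.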

Full Text