Abstract
As information sharing becomes ever more convenient, plagiarism is becoming more common. Cross-language plagiarism detection is an important problem that the academic community is working collectively to solve. In this paper, we propose a multiple-feature-based cross-language plagiarism detection model consisting of two stages: cross-language plagiarism candidate retrieval based on multiple features, and cross-language plagiarism detection based on dynamic text alignment. Candidate retrieval relies mainly on translation features. For detection, a text-alignment-based similarity analysis filters the final results among the identified paragraphs. In this step, our approach does not use a machine translation system to convert longer text; instead, it uses a dictionary to obtain the translation of each single word. Experimental results show that our method outperforms previous methods and achieves the best results on four datasets.
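To make the detection step more concrete, the following is a minimal sketch of dictionary-based word translation combined with a dynamic-programming alignment score. It is not the model described in this paper: the toy dictionary `TOY_DICT`, the function names `translate_words` and `alignment_score`, the longest-common-subsequence style alignment, and the normalization by the longer sequence are all assumptions chosen for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy Spanish-to-English dictionary (an assumption for this sketch);
# a real system would load a full bilingual lexicon.
TOY_DICT: Dict[str, Set[str]] = {
    "el": {"the"},
    "gato": {"cat"},
    "negro": {"black"},
    "duerme": {"sleeps", "sleep"},
    "perro": {"dog"},
}


def translate_words(tokens: List[str], dictionary: Dict[str, Set[str]]) -> List[Set[str]]:
    """Look up each word on its own; unknown words map to an empty set."""
    return [dictionary.get(token.lower(), set()) for token in tokens]


def alignment_score(
    src_tokens: List[str], tgt_tokens: List[str], dictionary: Dict[str, Set[str]]
) -> float:
    """Order-preserving alignment score in [0, 1].

    A source word matches a target word when the target word is among the
    source word's dictionary translations. The score is the length of the
    longest order-preserving chain of matches (an LCS-style recurrence),
    divided by the length of the longer sequence.
    """
    candidates = translate_words(src_tokens, dictionary)
    tgt = [token.lower() for token in tgt_tokens]
    n, m = len(candidates), len(tgt)
    if n == 0 or m == 0:
        return 0.0

    # dp[i][j] = best number of matches using the first i source words
    # and the first j target words.
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if tgt[j - 1] in candidates[i - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m] / max(n, m)


if __name__ == "__main__":
    source = "el gato negro duerme".split()
    target = "the black cat sleeps".split()
    print(alignment_score(source, target, TOY_DICT))
```

In this sketch, a paragraph pair would be kept as a plagiarism candidate only if its score exceeded some threshold; the threshold itself would have to be tuned on held-out data and is not specified here.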