Abstract

This paper proposes an algorithm for document plagiarism detection based on incremental knowledge construction with formal concept analysis (FCA). The incremental knowledge construction supports document matching between a source document in storage and a suspect document. A new concept similarity measure is also proposed for retrieving formal concepts from the knowledge construction. The measure employs appearance frequencies in the obtained knowledge construction and is mathematically proven to be a formal similarity metric. The approach can be applied to retrieve relevant information because the obtained structure uses FCA concepts, each of which is definable by a conjunction of properties. The performance of the proposed similarity measure is demonstrated in document plagiarism detection. Moreover, this paper provides an algorithm to build the information structure for document plagiarism detection. Thai text test collections are used to evaluate the performance of the implemented web application.

Highlights

  • Plagiarism has increased because of easy access to data on the World Wide Web

  • Ekbal et al. [40] propose a technique for external plagiarism detection based on textual similarity; it uses a vector space model, an information retrieval (IR) technique, to compare source and suspect documents

  • Document plagiarism detection using formal concept analysis (FCA) aims to detect good matches between the source document in storage and a suspect document


Summary

Introduction

Plagiarism has increased because of easy access to data on the World Wide Web. This work applies FCA to detect document plagiarism. The method provides related documents or groups of documents to the user. The application requires a similarity measure to retrieve source documents or to identify groups of similar documents in a concept hierarchy. Concept similarity in FCA has gained importance through its application to plagiarism detection, which must assess the similarity between formal concepts to find relevant information. We present and investigate a candidate algorithm to support plagiarism detection with the proposed concept similarity measures.
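To make the idea of retrieving source documents through formal concepts more concrete, the following is a minimal sketch under stated assumptions. The toy context `CONTEXT`, the helper names (`intent`, `extent`, `formal_concepts`, `rank_sources`), and the brute-force enumeration are all hypothetical and chosen for illustration. In particular, the Jaccard overlap used here is only a stand-in: it is not the frequency-based concept similarity measure proposed in the paper, whose definition is not given in this summary.

```python
from itertools import combinations

# Hypothetical toy formal context: stored document -> set of index terms.
CONTEXT = {
    "d1": {"concept", "lattice", "similarity"},
    "d2": {"concept", "lattice", "plagiarism"},
    "d3": {"plagiarism", "similarity"},
}

def intent(docs, context):
    """Terms shared by every document in `docs` (all terms if `docs` is empty)."""
    result = set().union(*context.values())
    for d in docs:
        result &= context[d]
    return frozenset(result)

def extent(terms, context):
    """Documents that contain every term in `terms`."""
    return frozenset(d for d, doc_terms in context.items() if terms <= doc_terms)

def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of documents.

    Exponential in the number of documents, so only suitable for tiny examples.
    """
    concepts = set()
    docs = sorted(context)
    for r in range(len(docs) + 1):
        for subset in combinations(docs, r):
            b = intent(subset, context)
            concepts.add((extent(b, context), b))
    return concepts

def jaccard(a, b):
    """Stand-in similarity (assumption): NOT the measure proposed in the paper."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def rank_sources(suspect_terms, context):
    """Score each stored document by its best-matching concept intent."""
    scores = {}
    for ext, itt in formal_concepts(context):
        s = jaccard(itt, suspect_terms)
        for d in ext:
            scores[d] = max(scores.get(d, 0.0), s)
    # Highest score first; ties broken by document name.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

if __name__ == "__main__":
    suspect = {"concept", "lattice", "plagiarism"}
    for doc, score in rank_sources(suspect, CONTEXT):
        print(doc, round(score, 3))
```

Each concept pairs a group of documents (the extent) with the conjunction of terms they all share (the intent), so matching a suspect document against intents returns either a single source document or a group of similar documents, as described above.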

Formal Concept Analysis
Related Works
The Proposed Document Plagiarism Detection Approach
Result
Implementation and Results
Conclusions