Source Code Plagiarism Detection Method Using Protégé Built Ontologies

Ion Smeureanu, Bogdan Iancu

doi:10.12948/issn14531305/17.3.2013.07

Open Access

Abstract

Software plagiarism is a growing and serious problem that affects computer science universities in particular and the quality of education in general. More and more students copy their thesis software from older theses or internet databases. Checking source code manually for similar or identical files is laborious and time-consuming, and may even be impossible given the size of today's digital repositories. An ontology is a way of describing a document's semantics, so it can readily be used for source code files too. The OWL Web Ontology Language can be applied to describe both the vocabulary and the taxonomy of a programming language's source code. SPARQL is a query language with SQL-like syntax that extracts stored or inferred information from ontologies. Our paper proposes a source code plagiarism detection method, based on ontologies created with the Protégé editor, which can be applied to scanning the software source code of students' theses.

Keywords: Ontology, OWL, SPARQL, Plagiarism, Protégé

1 Introduction

Today we have a huge volume of digital information, which is very useful on one hand but a disadvantage on the other. The useful part is that, by taking advantage of digital repositories, we can find any needed information more quickly than in the past (at the click of a button, as we usually say). The disadvantage is that finding similar or duplicated documents is now very difficult, especially when the job is done manually. That is why we look for alternative solutions in the field of plagiarism detection systems [1].

The term ontology is inherited from philosophy, where it refers to existence and the things that exist. In computer science those things are represented by data, and an ontology generally describes the semantics of the terms used in a specific domain (in our case, programming), providing a vocabulary for that domain as well as a computerized specification of the meaning of the terms used in the vocabulary. Ontologies range from taxonomies and classifications, through database schemas, to fully axiomatized theories. In recent years, ontologies have been adopted in many business and scientific communities as a way to share, reuse and process domain knowledge. Ontologies are now central to many applications such as scientific knowledge portals, information management and integration systems, electronic commerce, and semantic web services [2]. In our work we will use ontologies to build the knowledge graph specific to each source code that we suspect of plagiarism.

The OWL Web Ontology Language is a specification by the World Wide Web Consortium (W3C) and serves as a fundamental component of the Semantic Web initiative. OWL is based upon the Extensible Markup Language (XML), XML Schema [3], the Resource Description Framework (RDF) and RDF Schema (RDF-S) [4]. It comprises three sublanguages, OWL-Lite, OWL-DL and OWL-Full; of these, OWL-DL is the most often used because it provides maximum expressiveness while retaining computational completeness and decidability.

The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web. It is particularly intended for representing metadata about web resources, such as the title, author, and modification date of a web page, copyright and licensing information about a web document, or the availability schedule for some shared resource [4]. However, by generalizing the concept of a web resource, RDF can also be used to represent information about things that can be identified on the web, even when they cannot be directly retrieved on the web.

RDF is intended for situations in which this information needs to be processed by applications, rather than being only displayed to people. RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning. Since it is a common framework, application designers can leverage the availability of common RDF parsers and processing tools. …
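
To make the RDF data model more concrete, here is a minimal Python sketch, not taken from the paper, that stores a few RDF-style statements about a source file as (subject, predicate, object) tuples and matches them against a pattern with wildcards. The `ex:` vocabulary, the file name, and the `match` helper are all hypothetical, and a real system would rely on an RDF parser rather than in-memory tuples.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical statements about one source file; the ex: names are illustrative only.
TRIPLES: List[Triple] = [
    ("ex:Thesis1.java", "ex:declaresClass", "ex:Stack"),
    ("ex:Stack", "ex:hasMethod", "ex:push"),
    ("ex:Stack", "ex:hasMethod", "ex:pop"),
    ("ex:push", "ex:returnType", "ex:void"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with each non-None position of the pattern."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


# Which methods does ex:Stack have? None acts as a wildcard for the object.
methods = [obj for _, _, obj in match(TRIPLES, s="ex:Stack", p="ex:hasMethod")]
print(methods)
```

Because every fact has the same three-part shape, any application that understands triples can process statements like these without knowing in advance what they describe.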

Highlights

RDF can also be used to represent information about things that can be identified on the web, even when they cannot be directly retrieved on the web
An ontology consists of a set of classes organized in a subsumption hierarchy to represent a domain's salient concepts, a set of slots associated with classes to describe their properties and relationships, and a set of instances of those classes, that is, individual exemplars of the concepts that hold specific values for their properties. The Protégé-OWL editor enables users to build ontologies for the Semantic Web, in particular in the W3C's Web Ontology Language (OWL)
4 Conclusions: This paper showed that ontologies can be used to detect source code plagiarism
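
The three ingredients of an ontology named in the highlights (classes in a subsumption hierarchy, slots, and instances) can be sketched in plain Python as follows. This is an illustrative assumption, not the paper's ontology or Protégé's internal model: the `OntClass` and `Instance` types and the `Statement`, `Loop` and `ForLoop` concepts are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OntClass:
    """A concept with an optional parent (subsumption) and its own slots."""
    name: str
    parent: Optional["OntClass"] = None
    slots: List[str] = field(default_factory=list)

    def is_a(self, other: "OntClass") -> bool:
        """True if this class is `other` or a descendant of it."""
        node: Optional[OntClass] = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False

    def all_slots(self) -> List[str]:
        """Slots inherited from ancestors, followed by this class's own slots."""
        inherited = self.parent.all_slots() if self.parent else []
        return inherited + self.slots


@dataclass
class Instance:
    """An individual exemplar of a class, holding one value per slot."""
    of: OntClass
    values: Dict[str, str]


# Hypothetical programming-language concepts, from general to specific.
statement = OntClass("Statement", slots=["line"])
loop = OntClass("Loop", parent=statement, slots=["condition"])
for_loop = OntClass("ForLoop", parent=loop, slots=["initializer"])

example = Instance(
    of=for_loop,
    values={"line": "12", "condition": "i < n", "initializer": "int i = 0"},
)
print(for_loop.is_a(statement), for_loop.all_slots())
```

An ontology editor adds much more on top of this (property domains and ranges, restrictions, reasoning), so the sketch only shows how subsumption lets a specific concept inherit the slots of a more general one.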

Summary

Introduction

RDF can also be used to represent information about things that can be identified on the web, even when they cannot be directly retrieved on the web. RDF is intended for situations in which this information needs to be processed by applications, rather than being only displayed to people. RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning. Since it is a common framework, application designers can leverage the availability of common RDF parsers and processing tools. We will use RDF and OWL in our method as the standards and formats for saving the ontologies created with the Protégé editor. We prefer this approach because both are W3C standards, which lets us provide interoperability between our work and future related work. The results of SPARQL queries can be result sets or RDF graphs.
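
The difference between the two kinds of SPARQL result can be illustrated with a short Python sketch. It does not run SPARQL and is not the paper's implementation; it only mimics, over a hypothetical in-memory graph with made-up `ex:` names, what a SELECT query (a table of variable bindings) and a CONSTRUCT query (a new graph of triples) give back.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical graph describing the methods of two classes.
GRAPH: Set[Triple] = {
    ("ex:Stack", "ex:hasMethod", "ex:push"),
    ("ex:Stack", "ex:hasMethod", "ex:pop"),
    ("ex:Queue", "ex:hasMethod", "ex:enqueue"),
}


def select_methods(graph: Set[Triple]) -> List[Dict[str, str]]:
    """SELECT-style answer: a result set, one row of bindings per match."""
    rows = [{"class": s, "method": o} for s, p, o in graph if p == "ex:hasMethod"]
    return sorted(rows, key=lambda row: (row["class"], row["method"]))


def construct_owner_graph(graph: Set[Triple]) -> Set[Triple]:
    """CONSTRUCT-style answer: a new graph built from the matches."""
    return {(o, "ex:definedIn", s) for s, p, o in graph if p == "ex:hasMethod"}


for row in select_methods(GRAPH):
    print(row)
print(sorted(construct_owner_graph(GRAPH)))
```

A result set suits tabular reporting, while a constructed graph can be stored or queried again like any other RDF data.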

Journal: Informatica Economica
Publication Date: Sep 30, 2013
Citations: 3
License type: cc-by
