Abstract

In the Semantic Web community, many approaches have been developed for generating RDF (Resource Description Framework) resources. However, these approaches often produce duplicate resources, which are stored without elimination. Duplicate resources degrade data quality and unnecessarily inflate the size of the dataset. We propose an approach for detecting duplicate resources in RDF datasets using Hadoop and the MapReduce framework. RDF resources are compared using similarity metrics defined at the resource level, the RDF statement level, and the object level. Performance is assessed with standard evaluation metrics, and the experimental evaluation demonstrates the accuracy, effectiveness, and efficiency of the proposed approach.
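The core idea of statement-level comparison can be illustrated with a minimal sketch. The snippet below is a hypothetical, single-machine approximation of the pairwise comparison the abstract describes; the `jaccard` similarity function, the `find_duplicates` helper, the 0.8 threshold, and the toy dataset are all illustrative assumptions, not the paper's actual implementation or metrics.

```python
# Illustrative sketch (assumption): duplicate detection over RDF resources
# by comparing their sets of (predicate, object) statements.

def jaccard(a, b):
    """Jaccard similarity between two sets of (predicate, object) pairs."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def find_duplicates(resources, threshold=0.8):
    """Compare every resource pair; report pairs whose statement-level
    similarity meets the threshold (a stand-in for the paper's metrics)."""
    subjects = list(resources)
    dupes = []
    for i in range(len(subjects)):
        for j in range(i + 1, len(subjects)):
            s1, s2 = subjects[i], subjects[j]
            if jaccard(resources[s1], resources[s2]) >= threshold:
                dupes.append((s1, s2))
    return dupes

# Toy dataset (hypothetical): subject IRI -> set of (predicate, object) pairs.
data = {
    "ex:alice1": {("foaf:name", "Alice"), ("foaf:mbox", "alice@example.org")},
    "ex:alice2": {("foaf:name", "Alice"), ("foaf:mbox", "alice@example.org")},
    "ex:bob":    {("foaf:name", "Bob"),   ("foaf:mbox", "bob@example.org")},
}
```

In a MapReduce setting, such pairwise comparisons would typically be distributed by emitting candidate resource pairs from mappers and computing similarities in reducers, rather than looping over all pairs on one machine as above.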
