Creating voiD descriptions for Web-scale data

Christoph Böhm, Johannes Lorey, Felix Naumann

doi:10.1016/j.websem.2011.06.001

Abstract

When working with large amounts of crawled semantic data, such as the data provided by the Billion Triple Challenge (BTC), it is desirable to present the data in a manner best suited for end users. This includes conceiving and presenting explanatory metainformation. The Vocabulary of Interlinked Datasets (voiD) has been proposed as a means to annotate sets of RDF resources to facilitate not only human understanding, but also query optimization. In this article, we introduce tools that automatically generate voiD descriptions for large datasets. Our approach comprises different means to identify (sub)datasets and annotate the derived subsets according to the voiD specification. Due to the complexity of Web-scale Linked Data, all algorithms used for partitioning and augmenting are implemented in a cloud environment utilizing the MapReduce paradigm. We employed the Billion Triple Challenge 2010 dataset [6] to evaluate our approach, and present the results in this article. We have released a tool named voiDgen to the public that allows the generation of metainformation for such large datasets.
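
To make the idea of partitioning and annotating concrete, the following is a minimal single-machine sketch in the map/reduce style. It is not the voiDgen implementation: grouping triples by the host of the subject URI is only one assumed partitioning criterion, and the function names, the sample triples, and the dataset URI pattern `http://<host>/void#dataset` are hypothetical. The terms `void:Dataset`, `void:uriSpace`, and `void:triples` come from the voiD vocabulary.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical sample input: N-Triples-like lines (illustration only).
SAMPLE = [
    '<http://example.org/a> <http://xmlns.com/foaf/0.1/name> "A" .',
    '<http://example.org/b> <http://xmlns.com/foaf/0.1/knows> <http://example.org/a> .',
    '<http://example.com/x> <http://xmlns.com/foaf/0.1/name> "X" .',
]


def map_triple(line):
    """Map step: emit (host of the subject URI, 1) for one triple line.

    Lines whose subject is not a URI in angle brackets (for example
    blank nodes) and empty lines emit nothing.
    """
    parts = line.split(None, 1)
    if not parts:
        return
    subject = parts[0]
    if subject.startswith("<") and subject.endswith(">"):
        host = urlparse(subject[1:-1]).netloc
        if host:
            yield host, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def void_description(host, triples):
    """Render a small Turtle snippet describing one derived subset.

    The dataset URI is an assumed naming pattern, not prescribed by voiD.
    """
    return (
        "@prefix void: <http://rdfs.org/ns/void#> .\n"
        f"<http://{host}/void#dataset> a void:Dataset ;\n"
        f'    void:uriSpace "http://{host}/" ;\n'
        f"    void:triples {triples} ."
    )


if __name__ == "__main__":
    counts = reduce_counts(p for line in SAMPLE for p in map_triple(line))
    for host in sorted(counts):
        print(void_description(host, counts[host]))
        print()
```

In an actual MapReduce deployment the map and reduce functions would run distributed over the full crawl; the in-memory `Counter` here only stands in for the shuffle and reduce phases.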

Journal: Journal of Web Semantics
Publication Date: Jun 15, 2011
Citations: 64
