Mining the Semantic Web

Achim Rettinger, Nicola Fanizzi, Uta Lösch, Volker Tresp, Claudia D’Amato

doi:10.1007/s10618-012-0253-2

Abstract

In the Semantic Web vision of the World Wide Web, content will not only be accessible to humans but will also be available in machine-interpretable form as ontological knowledge bases. Ontological knowledge bases enable formal querying and reasoning and, consequently, a main research focus has been the investigation of how deductive reasoning can be utilized in ontological representations to enable more advanced applications. However, purely logical methods have not yet proven to be very effective, for several reasons. First, scaling reasoning to Web scale remains an unsolved problem. Second, logical reasoning has problems with uncertain information, which is abundant in Semantic Web data due to its distributed and heterogeneous nature. Third, the construction of ontological knowledge bases suitable for advanced reasoning techniques is complex, which ultimately results in a lack of expressive real-world data sets with large amounts of instance data. From another perspective, the more expressive structured representations open up new opportunities for data mining, knowledge extraction and machine learning techniques. If one accepts that part of the knowledge already lies in the data, inductive methods appear promising, in particular because they can inherently handle noisy, inconsistent, uncertain and missing data. While inducing concept structures from less structured sources (text, Web pages), as in ontology learning, has been covered broadly, given the problems mentioned above we focus on new methods for dealing with Semantic Web knowledge bases that rely on statistical inference on their standard representations.
We argue that machine learning research offers a wide variety of methods applicable to different expressivity levels of Semantic Web knowledge bases. Ranging from weakly expressive but widely available knowledge bases in RDF to highly expressive first-order knowledge bases, this paper surveys statistical approaches to mining the Semantic Web. We specifically cover similarity and distance-based methods, kernel machines, multivariate prediction models, relational graphical models and first-order probabilistic learning approaches, and discuss their applicability to Semantic Web representations. Finally, we present selected experiments that were conducted on Semantic Web mining tasks for some of the algorithms presented. These are intended to show the breadth and general potential of this exciting new research and application area for data mining.
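As a rough illustration of the simplest family named above, similarity-based methods over weakly expressive RDF data, the following minimal sketch predicts a missing class label for an entity from its most similar labeled neighbor. The triples, entity names, class labels and the choice of Jaccard similarity over (predicate, object) pairs are all assumptions made for this example; they are not taken from the paper and do not reproduce any of its surveyed algorithms.

```python
from collections import defaultdict

# Hypothetical toy RDF-style triples (subject, predicate, object); illustrative only.
TRIPLES = [
    ("alice", "worksAt", "uniA"),
    ("alice", "authorOf", "paper1"),
    ("alice", "memberOf", "mlGroup"),
    ("bob", "worksAt", "uniA"),
    ("bob", "authorOf", "paper1"),
    ("bob", "memberOf", "mlGroup"),
    ("carol", "worksAt", "uniB"),
    ("carol", "authorOf", "paper2"),
    ("carol", "memberOf", "dbGroup"),
]

# Hypothetical known class memberships; "alice" is the entity to classify.
LABELS = {"bob": "MLResearcher", "carol": "DBResearcher"}


def features(triples):
    """Describe each subject by the set of (predicate, object) pairs it occurs with."""
    feats = defaultdict(set)
    for subject, predicate, obj in triples:
        feats[subject].add((predicate, obj))
    return feats


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_label(entity, feats, labels):
    """Return the label of the labeled entity most similar to `entity` (1-nearest neighbor)."""
    nearest = max(labels, key=lambda other: jaccard(feats[entity], feats[other]))
    return labels[nearest]


FEATS = features(TRIPLES)
PREDICTED = predict_label("alice", FEATS, LABELS)
print(PREDICTED)
```

Because the prediction is induced from observed triples rather than deduced from axioms, a sketch of this kind still returns an answer when some statements about an entity are missing, which is the property the abstract attributes to inductive methods.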

Journal: Data Mining and Knowledge Discovery
Publication Date: Feb 10, 2012
Citations: 134

Similar Papers

Knowledge extraction from unstructured data and classification through distributed ontologies
01 Jan 2012

RDFKB: a semantic web knowledge base
16 Jul 2011

Induction of robust classifiers for web ontologies through kernel machines
Nicola Fanizzi ... Floriana Esposito
Journal of Web Semantics | VOL. 11
22 Nov 2011

Induction of Robust Classifiers for Web Ontologies Through Kernel Machines
Nicola Fanizzi ... Floriana Esposito
SSRN Electronic Journal
01 Jan 2012
