Mining Uncertain Graphs: An Overview

Vasileios Kassiano, Apostolos N Papadopoulos, Kostas Tsichlas, Anastasios Gounaris

doi:10.1007/978-3-319-57045-7_6

Abstract

Graphs play an important role in the modern world, owing to their widespread use for modeling, representing and organizing linked data. Since most of the “killer” applications require a graph-based representation (e.g., the Web, social network management, protein-protein interaction networks), efficient query processing and analysis techniques are required, not only because these graphs are massive but also because the operations that must be supported are complex and demand significant computational resources. In many cases, each graph edge e is annotated with a probability value p(e) that expresses its existential uncertainty: with probability p(e) the edge is present in the graph, and with probability \(1-p(e)\) it is absent. This gives rise to the concept of probabilistic graphs (also known as uncertain graphs). Formally, a probabilistic graph \(\mathcal{G}\) is a triplet (V, E, p), where V is the set of nodes, E is the set of edges and \(p: E \rightarrow (0,1]\) assigns an existence probability to each edge. The main challenge posed by this formulation is that problems that are relatively easy to solve in exact graphs become very difficult (or even intractable) in probabilistic graphs. In this paper, we give an overview of the algorithmic techniques proposed in the literature for uncertain graph analysis. In particular, we focus on the following graph mining tasks: clustering, maximal cliques, k-nearest neighbors and core decomposition. We conclude the paper with a short discussion of distributed mining of uncertain graphs, which is expected to achieve significant performance improvements.
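To make the definition of a probabilistic graph (V, E, p) concrete, the following minimal Python sketch treats each edge as an independent coin flip. Edge independence is an assumption here (it is common in this literature, but the abstract does not state it), and the function names, the toy graph and its probabilities are all hypothetical. The sketch computes the probability that two nodes are connected in two ways: exactly, by enumerating all \(2^{|E|}\) edge subsets, and approximately, by sampling. The exponential enumeration illustrates why tasks that are easy on exact graphs become expensive on uncertain ones.

```python
import itertools
import random


def world_probability(p, world):
    """Probability of one deterministic instance of the graph.

    Assumes edges exist independently: present edges contribute p(e),
    absent edges contribute 1 - p(e).
    """
    prob = 1.0
    for edge, pe in p.items():
        prob *= pe if edge in world else 1.0 - pe
    return prob


def sample_world(p, rng):
    """Draw one deterministic instance by flipping a coin per edge."""
    return {edge for edge, pe in p.items() if rng.random() < pe}


def connected(world, s, t):
    """Depth-first search on an undirected, deterministic edge set."""
    adj = {}
    for u, v in world:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for w in adj.get(u, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False


def exact_connection_probability(p, s, t):
    """Sum the probabilities of all 2^|E| instances in which s reaches t."""
    edges = list(p)
    total = 0.0
    for r in range(len(edges) + 1):
        for subset in itertools.combinations(edges, r):
            world = set(subset)
            if connected(world, s, t):
                total += world_probability(p, world)
    return total


def estimated_connection_probability(p, s, t, samples=10000, seed=0):
    """Monte Carlo estimate: fraction of sampled instances where s reaches t."""
    rng = random.Random(seed)
    hits = sum(connected(sample_world(p, rng), s, t) for _ in range(samples))
    return hits / samples


# Hypothetical toy graph: a triangle with made-up edge probabilities.
p = {("a", "b"): 0.9, ("b", "c"): 0.5, ("a", "c"): 0.2}
exact = exact_connection_probability(p, "a", "c")
estimate = estimated_connection_probability(p, "a", "c")
```

With three edges the exact computation visits only eight instances, but the count doubles with every additional edge, which is why sampling-based estimates of this kind are a natural fallback on larger graphs.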

Full Text