We propose a new graph metric and study its properties. In contrast to the standard distance in connected graphs, it takes into account all paths between vertices. Formally, it is defined as d(i,j)=q_{ii}+q_{jj}-q_{ij}-q_{ji}, where q_{ij} is the (i,j)-entry of the {\em relative forest accessibility matrix} Q(\epsilon)=(I+\epsilon L)^{-1}, L is the Laplacian matrix of the (weighted) (multi)graph, and \epsilon is a positive parameter. By the matrix-forest theorem, the (i,j)-entry of the relative forest accessibility matrix of a graph gives the proportion of spanning rooted forests in which i and j belong to the same tree rooted at i. Simple formulas describe how the proposed distance changes under basic graph transformations. We give a topological interpretation of d(i,j) in terms of the probability of failing to link i and j in a model of random links. The properties of this metric are compared with those of some other graph metrics. One application of this metric is to clustering procedures such as "centered partition." In another procedure, the relative forest accessibility and the corresponding distance serve to choose the centers of the clusters and to assign a cluster to each non-central vertex. A notion of the cumulative weight of connections between two vertices is proposed. The reasoning involves a reciprocity principle for weighted multigraphs. Connections between the resistance distance and the forest distance are established.
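
The definition above can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes an undirected graph given by a symmetric matrix of nonnegative edge weights, and the function name forest_distance and the 3-vertex path example are hypothetical choices made for illustration. It builds L, inverts I+\epsilon L to obtain Q(\epsilon), and evaluates d(i,j)=q_{ii}+q_{jj}-q_{ij}-q_{ji} for all pairs at once.

```python
import numpy as np


def forest_distance(A, eps=1.0):
    """Return (Q, D) for a symmetric weight matrix A and parameter eps > 0.

    Q = (I + eps * L)^{-1} is the relative forest accessibility matrix,
    D[i, j] = Q[i, i] + Q[j, j] - Q[i, j] - Q[j, i] is the distance d(i, j).
    Assumption: A is symmetric with nonnegative entries (undirected graph).
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # Laplacian: weighted degrees on the diagonal minus the weight matrix.
    L = np.diag(A.sum(axis=1)) - A
    # I + eps * L is strictly diagonally dominant for eps > 0, hence invertible.
    Q = np.linalg.inv(np.eye(n) + eps * L)
    q = np.diag(Q)
    # Broadcasting evaluates q_ii + q_jj - q_ij - q_ji for every pair (i, j).
    D = q[:, None] + q[None, :] - Q - Q.T
    return Q, D


# Illustrative input: the unweighted path on three vertices (0-based: 0 - 1 - 2).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
Q, D = forest_distance(A, eps=1.0)
print(Q)
print(D)
```

For this path with \epsilon=1, det(I+L)=8, which matches the 8 spanning rooted forests of the graph, and the entry of Q for the end vertex with itself is 5/8 because 5 of those forests have that vertex as a root. The resulting distances are 5/8 between adjacent vertices and 1 between the two end vertices, so the triangle inequality holds strictly here.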