For random walks on a graph \(\mathcal{G}\) with \(n\) vertices and \(m\) edges, the mean hitting time \(H_{j}\) from a vertex chosen from the stationary distribution to vertex \(j\) measures the importance of \(j\), while the Kemeny constant \(\mathcal{K}\) is the mean hitting time from one vertex to another selected randomly according to the stationary distribution. In this paper, we first establish a connection between the two quantities, representing \(\mathcal{K}\) in terms of \(H_{j}\) for all vertices. We then develop an efficient algorithm that estimates \(H_{j}\) for all vertices and \(\mathcal{K}\) in time nearly linear in \(m\). Moreover, we extend the centrality \(H_{j}\) of a single vertex to \(H(S)\) of a vertex set \(S\), and establish a link between \(H(S)\) and some other quantities. We further study the NP-hard problem of selecting a group \(S\) of \(k\ll n\) vertices with minimum \(H(S)\), whose objective function is monotonic and supermodular. Finally, we propose two greedy algorithms that approximately solve the problem. The former has an approximation factor of \((1-\frac{k}{k-1}\frac{1}{e})\) and \(O(kn^{3})\) running time, while the latter returns a \((1-\frac{k}{k-1}\frac{1}{e}-\epsilon)\)-approximate solution in time nearly linear in \(m\), for any parameter \(0 \lt \epsilon \lt 1\). Extensive experimental results validate the performance of our algorithms.
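To make the two quantities concrete, the following minimal sketch computes them exactly on a toy graph. It is not the nearly-linear-time estimator described above: it solves one small linear system per target vertex with exact rational arithmetic, which is only practical for tiny graphs. It assumes a connected, undirected, unweighted graph given as adjacency lists, with stationary distribution \(\pi_{j} = d_{j}/(2m)\), and it uses the identity \(\mathcal{K} = \sum_{j} \pi_{j} H_{j}\), which follows directly from the definitions; the paper's own representation may be stated differently. The function names `solve`, `hitting_times_to`, and `mean_hitting_and_kemeny`, as well as the example graph, are illustrative assumptions.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [x / pv for x in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def hitting_times_to(adj, j):
    """Return h with h[i] = expected number of steps to first reach j from i."""
    n = len(adj)
    others = [i for i in range(n) if i != j]
    idx = {v: r for r, v in enumerate(others)}
    # For i != j: h[i] = 1 + sum_u P[i][u] * h[u], with h[j] = 0.
    A = [[Fraction(0)] * (n - 1) for _ in others]
    for v in others:
        r = idx[v]
        A[r][r] += 1
        for u in adj[v]:
            if u != j:
                A[r][idx[u]] -= Fraction(1, len(adj[v]))
    x = solve(A, [Fraction(1)] * (n - 1))
    h = [Fraction(0)] * n
    for v in others:
        h[v] = x[idx[v]]
    return h


def mean_hitting_and_kemeny(adj):
    """Return (pi, H, K) for a connected undirected graph."""
    n = len(adj)
    two_m = sum(len(nbrs) for nbrs in adj)  # sum of degrees equals 2m
    pi = [Fraction(len(nbrs), two_m) for nbrs in adj]
    H = []
    for j in range(n):
        h = hitting_times_to(adj, j)
        # H_j: hitting time to j averaged over a stationary start vertex.
        H.append(sum(pi[i] * h[i] for i in range(n)))
    # Kemeny constant as the stationary-weighted average of the H_j.
    K = sum(pi[j] * H[j] for j in range(n))
    return pi, H, K


# Toy example (hypothetical): the path 0 - 1 - 2.
adj = [[1], [0, 2], [1]]
pi, H, K = mean_hitting_and_kemeny(adj)
print("H =", [str(x) for x in H])
print("K =", K)
```

On this path the middle vertex has the smallest mean hitting time (\(H_{1} = 1/2\), versus \(5/2\) for each endpoint), consistent with reading a smaller \(H_{j}\) as higher importance, and \(\mathcal{K} = 3/2\).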