Graphs have become a popular tool for capturing similarity relationships among data vectors in a variety of networks, and diffusion processes are used to model the data over the graph topology. A novel approach is put forth that exploits data similarity and diffusion models on a graph to improve upon the reconstruction mean-square error (MSE) performance of principal component analysis (PCA). The tasks of dimensionality reduction and reconstruction are formulated as graph matrix filtering operations within a framework that is optimal in the MSE sense: the formulation seeks filter matrices that minimize the reconstruction MSE over all the graph data. Block coordinate descent is applied to the MSE cost in the graph spectral domain to recursively determine the MSE-optimal dimensionality-reducing and reconstruction filters. Online implementations are put forth to handle continuous streams of graph data, and almost sure convergence to a stationary point of the ensemble reconstruction MSE cost is established. Analysis of the reconstruction MSE reveals that the lowest MSE achieved by the proposed approach is no larger than that of standard PCA, and that there is a bound on the filter orders beyond which no further MSE reduction occurs. Extensive numerical tests using synthetic data, as well as real image datasets, demonstrate the advantage of the graph-based framework over standard PCA and related methods.
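To make the baseline in this comparison concrete, the following is a minimal sketch of block coordinate descent on a plain (graph-free) reconstruction MSE cost, set against the standard PCA reconstruction MSE. It is not the graph-filter formulation summarized above: the graph topology, the diffusion model, and the spectral-domain cost are all omitted, and the names `pca_reconstruction_mse`, `bcd_reduce_reconstruct`, `A` (reducing matrix), and `B` (reconstruction matrix) are hypothetical, introduced only for illustration.

```python
import numpy as np


def pca_reconstruction_mse(X, q):
    """Per-sample reconstruction MSE of rank-q PCA on centered data X (p x N)."""
    N = X.shape[1]
    C = X @ X.T / N                       # sample covariance (data assumed centered)
    _, eigvecs = np.linalg.eigh(C)        # eigenvalues returned in ascending order
    U = eigvecs[:, -q:]                   # top-q principal directions
    X_hat = U @ (U.T @ X)                 # project onto the subspace, then reconstruct
    return np.sum((X - X_hat) ** 2) / N


def bcd_reduce_reconstruct(X, q, n_iter=50, seed=0):
    """Alternately minimize ||X - B A X||_F^2 over B (p x q) and A (q x p).

    Illustrative assumption: unstructured matrices A and B stand in for the
    graph filters of the abstract; no graph structure is used here.
    """
    rng = np.random.default_rng(seed)
    p, N = X.shape
    A = rng.standard_normal((q, p))       # random initial reducing matrix
    history = []
    for _ in range(n_iter):
        Z = A @ X                         # reduced data, q x N
        # Block 1: with A fixed, B solves a least-squares problem.
        B = X @ Z.T @ np.linalg.pinv(Z @ Z.T)
        # Block 2: with B fixed, A = pinv(B) minimizes the cost.
        A = np.linalg.pinv(B)
        history.append(np.sum((X - B @ A @ X) ** 2) / N)
    return A, B, history


# Synthetic, centered data: p = 8 features, N = 200 samples.
rng = np.random.default_rng(1)
X = rng.standard_normal((8, 200))
X -= X.mean(axis=1, keepdims=True)

mse_pca = pca_reconstruction_mse(X, q=3)
A, B, history = bcd_reduce_reconstruct(X, q=3)
```

Because each block update is an exact minimization, the values in `history` are non-increasing, and since `B @ A` has rank at most `q`, they cannot fall below `mse_pca`. In this unstructured setting PCA is therefore already optimal; the abstract's claim concerns the graph-based formulation, whose lowest MSE is stated to be no larger than the PCA value.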