Abstract

Ranking is one of the key problems in information retrieval. Recently, there has been significant interest in a class of ranking algorithms based on the assumption that data are sampled from a low-dimensional manifold embedded in a higher-dimensional Euclidean space. In this paper, we study a popular graph Laplacian based ranking algorithm [23] using an analytical method, which provides theoretical insights into the algorithm that go beyond the intuitive idea of diffusion. Our analysis shows that the algorithm is sensitive to a commonly used parameter because of its use of the symmetric normalized graph Laplacian. We also show that the ranking function may diverge to infinity at the query point in the limit of infinite samples. To address these issues, we propose an improved ranking algorithm on manifolds that uses the Green's function of an iterated unnormalized graph Laplacian, which is more robust and density adaptive, as well as pointwise continuous in the limit of infinite samples. We also empirically explore, for the first time in the ranking literature, two variants from a family of twice normalized graph Laplacians. Experimental results on text and image data support our analysis and also suggest the potential value of twice normalized graph Laplacians in practice.
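For orientation, the kind of symmetric-normalized manifold ranking analyzed here is commonly formulated in closed form as f = (I − αS)⁻¹y with S = D⁻¹ᐟ²WD⁻¹ᐟ². The sketch below illustrates that classical formulation only; the Gaussian affinity, bandwidth sigma, and the value of α are illustrative assumptions, not details taken from this paper.

```python
import numpy as np

def manifold_rank(X, query_idx, alpha=0.99, sigma=1.0):
    """Rank rows of X by relevance to the query point(s).

    Minimal sketch of manifold ranking with a symmetric normalized
    affinity matrix: f = (I - alpha * S)^{-1} y, S = D^{-1/2} W D^{-1/2}.
    The affinity choice and parameters (alpha, sigma) are illustrative.
    """
    n = X.shape[0]

    # Gaussian affinity matrix with zeroed diagonal (no self-loops).
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Indicator vector marking the query point(s).
    y = np.zeros(n)
    y[query_idx] = 1.0

    # Closed-form ranking scores; larger f means more relevant.
    f = np.linalg.solve(np.eye(n) - alpha * S, y)
    return np.argsort(-f)

# Toy usage: rank 200 random 2-D points by relevance to point 0.
X = np.random.default_rng(0).normal(size=(200, 2))
ranking = manifold_rank(X, query_idx=0)
```

The sensitivity discussed in the abstract concerns the parameter α in this formulation; the paper's proposed alternative replaces the symmetric-normalized propagation with the Green's function of an iterated unnormalized graph Laplacian, which is not reproduced here.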
