Abstract

Assessing node-to-node similarity from graph topology arises in a myriad of applications, e.g., web search. SimRank is a notable measure of this type, built on the intuition that "two nodes are similar if their in-neighbors are similar". Most existing work on retrieving SimRank considers only all-pairs SimRank s(*, *) and single-source SimRank s(*, j) (scores between every node and a query node j), yet there are appealing applications for partial-pairs SimRank, e.g., similarity join. Given two node subsets A and B in a graph, partial-pairs SimRank assessment aims to retrieve only the scores {s(a, b)} for all a ∈ A and all b ∈ B. However, the best-known solution does not appear to be self-contained, since it hinges on the premise that the SimRank scores of node-pairs in an h-go cover set are given beforehand. This paper focuses on efficient assessment of partial-pairs SimRank in a self-contained manner. (1) We devise a novel "seed germination" model that computes partial-pairs SimRank in O(k |E| min{|A|, |B|}) time and O(|E| + k |V|) memory for k iterations on a graph of |V| nodes and |E| edges. (2) We further eliminate unnecessary edge accesses to improve the time of partial-pairs SimRank to O(m min{|A|, |B|}), where m ≤ min{k |E|, Δ^{2k}} and Δ is the maximum degree. (3) We show that our partial-pairs SimRank model can also handle the computation of all-pairs and single-source SimRank. (4) We empirically verify that our algorithms are (a) 38x faster than the best-known competitors, and (b) memory-efficient, allowing scores to be assessed accurately on graphs with tens of millions of links.
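
To make the problem statement concrete, the following is a minimal sketch of the standard iterative SimRank recurrence, run over all node pairs and then filtered down to A × B. It is not the "seed germination" model described above, whose details the abstract does not give; it is the naive baseline that a partial-pairs method would aim to avoid. The function name simrank_partial_pairs, the decay factor C = 0.6, and the edge-list input format are assumptions made for illustration.

```python
from itertools import product


def simrank_partial_pairs(edges, A, B, C=0.6, k=5):
    """Naive baseline (illustrative only): iterate all-pairs SimRank for
    k rounds, then return only the scores s(a, b) with a in A, b in B.

    edges is a list of directed (source, target) pairs; C is the decay
    factor. Names and defaults are assumptions, not from the paper.
    """
    nodes = sorted({u for e in edges for u in e} | set(A) | set(B))

    # In-neighbor lists: SimRank compares nodes via who points to them.
    in_nbrs = {v: [] for v in nodes}
    for u, v in edges:
        in_nbrs[v].append(u)

    # s_0(a, b) = 1 if a == b, else 0.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}

    for _ in range(k):
        new = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new[(a, b)] = 1.0
            elif not in_nbrs[a] or not in_nbrs[b]:
                # A node without in-neighbors is similar only to itself.
                new[(a, b)] = 0.0
            else:
                # Average similarity over all pairs of in-neighbors,
                # damped by the decay factor C.
                total = sum(sim[(i, j)] for i in in_nbrs[a] for j in in_nbrs[b])
                new[(a, b)] = C * total / (len(in_nbrs[a]) * len(in_nbrs[b]))
        sim = new

    # Keep only the requested partial pairs.
    return {(a, b): sim[(a, b)] for a in A for b in B}


# Usage: nodes "a" and "b" share the single in-neighbor "u".
scores = simrank_partial_pairs([("u", "a"), ("u", "b")], A=["a"], B=["b", "u"])
print(scores)
```

This baseline computes and stores a score for every pair of nodes even when A and B are small, which is the overhead that motivates a dedicated partial-pairs method.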
