Abstract

In this work, a non-stationary technique based on the Power method for accelerating the parallel computation of the PageRank vector is proposed, and its theoretical convergence is analyzed. This iterative non-stationary model, which uses the eigenvector formulation of the PageRank problem, reduces the computation needed to obtain the PageRank vector by eliminating synchronization points among processes: at each iteration of the Power method, the block of the iterate vector assigned to each process can be updated locally more than once before a global synchronization is performed. Parallel implementations of several strategies that combine this novel non-stationary approach with extrapolation methods have been developed using hybrid MPI/OpenMP programming. The experiments were carried out on a cluster of 12 nodes, each equipped with two Intel Xeon hexacore processors. The behaviour of the proposed parallel algorithms has been studied with realistic datasets, highlighting their performance compared with other parallel techniques for solving the PageRank problem. Concretely, the experimental results show a time reduction of up to 58.4% relative to the parallel Power method when a small number of local updates is performed before each global synchronization. The approach outperforms both the two-stage algorithms and the extrapolation algorithms, and the advantage grows as the number of processes increases.
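
The idea of updating each block several times between synchronizations can be illustrated with a small sequential sketch. This is not the authors' MPI/OpenMP implementation: the function name `nonstationary_power`, the block layout, and the hand-made, strictly positive column-stochastic matrix `A` are all assumptions chosen for illustration. Each block stands in for one process, and the contribution of the other blocks stays frozen until the next global synchronization.

```python
def nonstationary_power(A, blocks, local_updates=2, tol=1e-12, max_outer=1000):
    """Sequential sketch: each block is updated `local_updates` times per outer step.

    A is a dense column-stochastic matrix (list of rows); `blocks` is a list of
    index lists that partition range(len(A)). Hypothetical helper, not the
    paper's algorithm as published.
    """
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(max_outer):
        new_x = list(x)
        for block in blocks:  # one block per simulated process
            inside = set(block)
            # Contribution of the other blocks, fixed until the next synchronization.
            remote = {i: sum(A[i][j] * x[j] for j in range(n) if j not in inside)
                      for i in block}
            local = {i: x[i] for i in block}
            for _ in range(local_updates):  # repeated local updates
                local = {i: remote[i] + sum(A[i][j] * local[j] for j in block)
                         for i in block}
            for i in block:
                new_x[i] = local[i]
        # Global synchronization: gather all blocks and renormalize to sum 1.
        s = sum(new_x)
        new_x = [v / s for v in new_x]
        if sum(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    return x


# Made-up 3-page example: strictly positive, every column sums to 1.
A = [[0.05, 0.05, 0.90],
     [0.475, 0.05, 0.05],
     [0.475, 0.90, 0.05]]
x_plain = nonstationary_power(A, [[0, 1], [2]], local_updates=1)
x_local = nonstationary_power(A, [[0, 1], [2]], local_updates=2)
```

With `local_updates=1` the sketch reduces to the ordinary Power method, so both calls should return the same vector up to the tolerance. The sketch says nothing about run time, which in a real parallel setting depends on the cost of synchronization.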

Highlights

  • The PageRank algorithm is a well-known algorithm used for determining the relevance of Web pages [1]

  • Computing the PageRank vector requires working with large matrices, so storing them in full (dense) format is not appropriate because the memory requirements would be too high

  • For this purpose, we rely on the Compressed Sparse Row (CSR) format [33]
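
As a rough illustration of why CSR helps, the sketch below stores only the nonzero entries of a matrix in three arrays and multiplies the matrix by a vector without ever touching the zeros. The helper names `to_csr` and `csr_matvec` are hypothetical and written in plain Python for readability; they are not taken from the paper's code.

```python
def to_csr(dense):
    """Convert a dense matrix (list of rows) into CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)   # nonzero value
                col_idx.append(j)  # its column index
        row_ptr.append(len(values))  # where the next row starts in `values`
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = M x using only the stored nonzeros of M."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


# Made-up 3 x 3 example with four nonzeros.
M = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
values, col_idx, row_ptr = to_csr(M)
y = csr_matvec(values, col_idx, row_ptr, [0.4, 0.2, 0.4])
```

Storage grows with the number of nonzeros plus the number of rows, rather than with the square of the matrix dimension, which is the property that matters for Web-scale link matrices.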


Summary

Introduction

The PageRank algorithm is a well-known algorithm used for determining the relevance of Web pages [1]. The link structure of the Web is described by an adjacency matrix $G = [g_{ij}]_{i,j=1}^{n}$, where $g_{ij} = 1$ when there is a link from page $j$ to page $i$, with $i \neq j$, and $g_{ij} = 0$ otherwise. This adjacency matrix leads us to the transition matrix $P = [p_{ij}]_{i,j=1}^{n}$. Note that $c_j$ represents the number of out-links from page $j$. In this way, the PageRank vector is a probability vector $x$ such that $Px = x$, with $\|x\|_1 = \sum_{i=1}^{n} |x_i| = \sum_{i=1}^{n} x_i = 1$.
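
To make the eigenvector formulation concrete, here is a minimal sketch of the basic Power method on a tiny graph. It assumes the usual convention $p_{ij} = g_{ij} / c_j$, which this excerpt does not state explicitly, and it assumes every page has at least one out-link. The function names and the 3-page graph are made up for illustration.

```python
def transition_matrix(G):
    """Build P from adjacency matrix G, assuming p_ij = g_ij / c_j."""
    n = len(G)
    # c[j] = number of out-links of page j = sum of column j of G.
    c = [sum(G[i][j] for i in range(n)) for j in range(n)]
    # Columns with c[j] == 0 (pages without out-links) are left as zeros here.
    return [[G[i][j] / c[j] if c[j] else 0.0 for j in range(n)]
            for i in range(n)]


def power_method(P, tol=1e-12, max_iter=1000):
    """Iterate x <- P x, normalized in the 1-norm, until x stops changing."""
    n = len(P)
    x = [1.0 / n] * n  # uniform starting probability vector
    for _ in range(max_iter):
        y = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
        s = sum(y)
        y = [v / s for v in y]  # keep the entries summing to 1
        if sum(abs(a - b) for a, b in zip(y, x)) < tol:
            return y
        x = y
    return x


# Made-up graph: page 0 links to pages 1 and 2, page 1 to page 2, page 2 to page 0.
# G[i][j] = 1 when page j links to page i.
G = [[0, 0, 1],
     [1, 0, 0],
     [1, 1, 0]]
x = power_method(transition_matrix(G))
print([round(v, 6) for v in x])  # prints [0.4, 0.2, 0.4]
```

For this graph the equations $Px = x$ and $\sum_i x_i = 1$ can be solved by hand: $x_0 = x_2$, $x_1 = x_0 / 2$, hence $x = (0.4, 0.2, 0.4)$, which matches the iteration.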

