Abstract

We show how to multiply two $n \times n$ matrices $S$ and $T$ over semirings in the Congested Clique model, where $n$ nodes communicate in a fully connected synchronous network using $O(\log n)$-bit messages, within $O(\mathrm{nz}(S)^{1/3}\,\mathrm{nz}(T)^{1/3}/n + 1)$ rounds of communication, where $\mathrm{nz}(S)$ and $\mathrm{nz}(T)$ denote the number of non-zero elements in $S$ and $T$, respectively. By leveraging the sparsity of the input matrices, our algorithm greatly reduces communication costs compared with general multiplication algorithms [Censor-Hillel et al. (2015) [9]], and thus improves upon the state-of-the-art for matrices with $o(n^2)$ non-zero elements. Moreover, our algorithm also surpasses previous solutions when only one of the two matrices is sparse. In particular, this makes it possible to efficiently raise a sparse matrix to a power greater than 2. As applications, we show how to speed up the computation of 4-cycle counting and all-pairs-shortest-paths on non-dense graphs.

Our algorithmic contribution is a new deterministic method of restructuring the input matrices in a sparsity-aware manner, which assigns each node element-wise multiplication tasks that are not necessarily consecutive but guarantee a balanced element distribution, providing for communication-efficient multiplication.

Moreover, this new deterministic method for restructuring matrices may be used to restructure the adjacency matrix of input graphs, enabling faster deterministic solutions for graph-related problems. As an example, we present a new sparsity-aware, deterministic algorithm which solves the triangle listing problem in $O(m/n^{5/3} + 1)$ rounds, a complexity that was previously obtained by a randomized algorithm [Pandurangan et al. (2018) [26]], and that matches the known lower bound of $\tilde{\Omega}(n^{1/3})$ when $m = n^2$ of [Izumi and Le Gall (2017) [19], Pandurangan et al. (2018) [26]].

Naturally, our triangle listing algorithm also implies triangle counting within the same complexity of $O(m/n^{5/3} + 1)$ rounds, which is (possibly more than) a cubic improvement over the previously known deterministic $O(m^2/n^3)$-round algorithm [Dolev et al. (2012) [12]].
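To get a feel for the round bounds stated above, the following minimal sketch evaluates the three expressions with all constant factors dropped. The function names are hypothetical and chosen only for this illustration; the results are orders of growth, not actual round counts.

```python
# Illustrative only: evaluates the asymptotic expressions from the abstract
# with constants ignored. Function names are hypothetical.

def smm_rounds(nz_s: float, nz_t: float, n: int) -> float:
    """nz(S)^(1/3) * nz(T)^(1/3) / n + 1, the sparse multiplication bound."""
    return (nz_s ** (1 / 3)) * (nz_t ** (1 / 3)) / n + 1


def triangle_listing_rounds(m: float, n: int) -> float:
    """m / n^(5/3) + 1, the deterministic triangle listing bound."""
    return m / n ** (5 / 3) + 1


def earlier_deterministic_rounds(m: float, n: int) -> float:
    """m^2 / n^3, the earlier deterministic triangle counting bound."""
    return m ** 2 / n ** 3


if __name__ == "__main__":
    n = 1000
    # Dense inputs (nz = n^2): the sparse bound becomes n^(1/3) + 1.
    print(smm_rounds(n ** 2, n ** 2, n))
    # Very sparse inputs (nz = n): the additive 1 dominates.
    print(smm_rounds(n, n, n))
    # Triangle bounds on a dense graph (m = n^2).
    print(triangle_listing_rounds(n ** 2, n), earlier_deterministic_rounds(n ** 2, n))
```

With nz(S) = nz(T) = n^2 the first expression evaluates to n^{1/3} + 1, and with m = n^2 the triangle listing expression evaluates to the same value, which is how the bound meets the stated lower bound in the dense case.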

Highlights

  • Matrix multiplication is a fundamental algebraic task, with abundant applications to various computations

  • The work of Censor-Hillel et al. [7] recently showed that known matrix multiplication algorithms for the parallel setting can be adapted to the distributed Congested Clique model, which consists of $n$ nodes in a fully connected synchronous network, limited by a bandwidth of $O(\log n)$ bits per message

  • In this paper we focus our attention on the task of multiplying sparse matrices in the Congested Clique model, providing a novel deterministic algorithm with a round complexity which depends on the sparsity of the input matrices


Summary

Introduction

Matrix multiplication is a fundamental algebraic task, with abundant applications to various computations. The work of Censor-Hillel et al. [7] recently showed that known matrix multiplication algorithms for the parallel setting can be adapted to the distributed Congested Clique model, which consists of $n$ nodes in a fully connected synchronous network, limited by a bandwidth of $O(\log n)$ bits per message. This significantly improved the state-of-the-art for a variety of tasks, including triangle and 4-cycle counting, girth computations, and (un)weighted/(un)directed all-pairs-shortest-paths (APSP). Pandurangan et al. [23] showed a randomized triangle listing algorithm, with the same round complexity as the one we obtain.
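As background for why tasks such as triangle counting and APSP reduce to matrix multiplication, here is a minimal centralized sketch of a sparse product over an arbitrary semiring. It is not the distributed algorithm of the paper; the dictionary representation and the name `semiring_multiply` are assumptions made for this illustration. Only pairs of stored (non-"zero") entries are ever combined, which is the kind of saving a sparsity-aware algorithm aims for.

```python
import operator
from typing import Callable, Dict, List, Tuple

# Hypothetical sparse representation: (row, col) -> stored entry.
# Missing entries stand for the semiring's additive identity
# (0 for the ordinary ring, +infinity for min-plus).
Sparse = Dict[Tuple[int, int], float]


def semiring_multiply(
    S: Sparse,
    T: Sparse,
    add: Callable[[float, float], float],
    mul: Callable[[float, float], float],
) -> Sparse:
    """Centralized sparse product P = S * T over the semiring (add, mul)."""
    # Index T by row so each stored S[i, k] meets only stored T[k, j].
    t_rows: Dict[int, List[Tuple[int, float]]] = {}
    for (k, j), t in T.items():
        t_rows.setdefault(k, []).append((j, t))

    P: Sparse = {}
    for (i, k), s in S.items():
        for j, t in t_rows.get(k, []):
            term = mul(s, t)
            P[(i, j)] = add(P[(i, j)], term) if (i, j) in P else term
    return P


# Ordinary (+, *) ring: trace(A^3) / 6 counts triangles in an undirected graph.
A: Sparse = {(i, j): 1 for i in range(3) for j in range(3) if i != j}
A2 = semiring_multiply(A, A, operator.add, operator.mul)
A3 = semiring_multiply(A2, A, operator.add, operator.mul)
triangles = sum(v for (i, j), v in A3.items() if i == j) // 6

# Min-plus semiring: squaring a weight matrix with a zero diagonal
# yields shortest distances over paths of at most two edges.
W: Sparse = {(0, 0): 0, (1, 1): 0, (2, 2): 0,
             (0, 1): 1, (1, 0): 1, (1, 2): 2, (2, 1): 2}
W2 = semiring_multiply(W, W, min, operator.add)
```

In the first example `A` is the adjacency matrix of a single triangle, and in the second `W` describes a path 0, 1, 2 with edge weights 1 and 2, so `W2[(0, 2)]` holds the two-edge distance between the endpoints.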

Our contribution
Challenges and Our Techniques
Related work
Preliminaries
Fast Sparse Matrix Multiplication
Fast General Sparse Matrix Multiplication - Algorithm SMM
Fast Sparse Balanced Matrix Multiplication - Algorithm SBMM
Discussion
