We study the problem of sparse-matrix dense-vector multiplication (SpMV) in external memory. The task of SpMV is to compute \(y := Ax\), where A is a sparse N×N matrix and x is a vector. We express sparsity by a parameter k, and for each choice of k consider the class of matrices where the number of nonzero entries is kN, i.e., where the average number of nonzero entries per column is k.

We investigate the worst-case complexity in external memory, i.e., the best possible upper bound on the number of I/Os, as a function of k, N, and the parameters M (memory size) and B (track size) of the I/O-model. We determine this complexity up to a constant factor for all meaningful choices of these parameters, as long as \(k \leq N^{1-\varepsilon}\), where ε depends on the problem variant. Our model of computation for the lower bound is a combination of the I/O-models of Aggarwal and Vitter, and of Hong and Kung.

We study variants of the problem, differing in the memory layout of A. If A is stored in column major layout, we prove that SpMV has I/O complexity \(\Theta(\min\{\frac{kN}{B}\max\{1,\log_{M/B}\frac{N}{\max\{k,M\}}\},\,kN\})\) for \(k \leq N^{1-\varepsilon}\) and any constant 0<ε<1. If the algorithm can choose the memory layout, the I/O complexity reduces to \(\Theta(\min\{\frac{kN}{B}\max\{1,\log_{M/B}\frac{N}{kM}\},\,kN\})\) for \(k\leq\sqrt[3]{N}\). In contrast, if the algorithm must be able to handle an arbitrary layout of the matrix, the I/O complexity is \(\Theta(\min\{\frac{kN}{B}\max\{1,\log_{M/B}\frac{N}{M}\},\,kN\})\) for k≤N/2.

In the cache-oblivious setting we prove that, under the tall-cache assumption \(M \geq B^{1+\varepsilon}\), the I/O complexity is \(\mathcal{O}(\frac{kN}{B}\max\{1,\log_{M/B}\frac{N}{\max\{k,M\}}\})\) for A in column major layout.
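To make the problem statement and the column-major bound concrete, the following is a minimal Python sketch and not the algorithm analyzed in the paper. It assumes the nonzeros of A are given as (row, column, value) triples sorted by column, and it evaluates only the expression inside the Θ for the column major layout, with constant factors dropped. The function names `spmv_column_major` and `column_major_bound` are hypothetical, and the sketch assumes M > B so that the logarithm base M/B exceeds 1.

```python
import math


def spmv_column_major(n, entries, x):
    """Compute y = A x in internal memory (no I/O accounting).

    `entries` lists the nonzeros of the n-by-n matrix A as
    (row, col, value) triples in column-major order (an assumed format).
    """
    y = [0.0] * n
    for row, col, value in entries:
        # Each nonzero a_ij contributes a_ij * x_j to y_i.
        y[row] += value * x[col]
    return y


def column_major_bound(k, n, m, b):
    """Evaluate min{(kN/B) * max{1, log_{M/B}(N / max{k, M})}, kN}.

    This is the expression inside the Theta for column major layout;
    hidden constant factors are ignored. Assumes m > b >= 1.
    """
    log_term = math.log(n / max(k, m), m / b)
    return min(k * n / b * max(1.0, log_term), k * n)
```

For example, with k = 4, N = 2^20, M = 2^10 and B = 2^5, the logarithm is log_32(2^10) = 2, so the first term is 2·kN/B, well below the kN term that corresponds to one I/O per nonzero.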