The Field-Programmable Gate Array (FPGA) is an excellent match for the Sparse Matrix–Vector Multiply (SMVM) operation because of its enormous computational capacity and because it allows a custom memory hierarchy to be built that matches the operation's memory access patterns. This paper describes a new sparse matrix storage format that works in conjunction with a custom memory subsystem, which decodes the format on the fly. The SMVM operation is implemented on a single FPGA and on a small parallel system of four FPGAs. The parameters that affect the performance of the sequential and parallel designs are investigated, as is the speedup obtained for different matrices.