The data-access scheme affects the complexity of parallel architectures that compute the multi-level two-dimensional (2-D) discrete wavelet transform (DWT). In this paper, we study the data-access schemes used in existing parallel 2-D DWT architectures. Based on this study, we propose a novel data-access scheme that avoids the data multiplexing common in existing parallel architectures. Further, a block formulation is presented for vector computation of the multi-level lifting 2-D DWT. A generic design of a processing unit for computing the one-level 2-D DWT is derived from the proposed block formulation. The generic design is scaled by a single parameter, namely the input-vector size. A regular and modular parallel architecture is then derived from the generic processing-unit design. The proposed parallel architecture is easily scalable to larger block sizes and higher DWT levels without sacrificing its circuit regularity or modularity, which is an important feature of the design. Comparison results show that, for three-level DWT and a block size of 64, the proposed architecture saves 24% in area–delay product (ADP) and 10% in power consumption relative to the best available similar structure, with larger savings for larger block sizes. Unlike the existing structure, it requires no overhead memory.