A VLSI architecture for real-time processing of a full-search block matching algorithm is described. The architecture is motivated by the intensive computation required to extract the motion vector over the full search range in real time. It uses a parallel algorithm based on partial result accumulation: the partial sums of the candidate block distortions are accumulated individually in a cyclic storage buffer, one entry per distortion measure. Based on a one-dimensional semi-systolic array design, a parameterizable motion estimation processor (MEP) can be built for motion vector estimation with different reference block sizes and search ranges. Moreover, any number of MEPs can be cascaded to handle larger tracking areas and higher pixel rates. The proposed architecture has the following advantages: 1) it takes serial data inputs to save pin count but performs parallel processing; 2) its search range is parameterizable, so it can be adapted to different video applications; 3) it supports any vertical reference block size and any horizontal reference block size that is a multiple of the basic size N; 4) it can operate in real time for videoconferencing applications; 5) it has a degree of modularity suitable for VLSI implementation; and 6) it can be implemented in VLSI easily and cost-effectively, owing to its simple structure, composed mainly of a one-dimensional array of processing elements (PEs), and its low control overhead.
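As a rough software illustration of full-search block matching with partial result accumulation, the sketch below keeps one accumulator per candidate displacement and adds one reference-block row of partial sums at a time. It is only a behavioural model under stated assumptions, not the hardware design described above: the distortion measure is assumed to be the sum of absolute differences (SAD), which the abstract does not name, and the function name `full_search_sad`, the list-of-lists frame layout, and the convention of reporting offsets from the top-left corner of the search area are all hypothetical choices made for the example.

```python
def full_search_sad(ref_block, search_area):
    """Exhaustively match ref_block against every position in search_area.

    Assumptions (not from the source): distortion is the sum of absolute
    differences (SAD); offsets are (row, col) from the top-left corner of
    the search area, not motion vectors centred on the block position.
    Returns ((row_offset, col_offset), best_sad).
    """
    block_h = len(ref_block)
    block_w = len(ref_block[0])
    n_dy = len(search_area) - block_h + 1  # candidate row offsets
    n_dx = len(search_area[0]) - block_w + 1  # candidate column offsets

    # One accumulator per candidate displacement.
    acc = [[0] * n_dx for _ in range(n_dy)]

    # Process the reference block one row at a time, adding each row's
    # partial sum to the accumulator of every candidate.
    for r in range(block_h):
        ref_row = ref_block[r]
        for dy in range(n_dy):
            area_row = search_area[dy + r]
            for dx in range(n_dx):
                acc[dy][dx] += sum(
                    abs(ref_row[c] - area_row[dx + c]) for c in range(block_w)
                )

    # The candidate with the smallest accumulated distortion wins;
    # ties resolve to the smallest (row, col) offset.
    best_sad, best_dy, best_dx = min(
        (acc[dy][dx], dy, dx) for dy in range(n_dy) for dx in range(n_dx)
    )
    return (best_dy, best_dx), best_sad


if __name__ == "__main__":
    area = [
        [0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15],
    ]
    block = [[6, 7], [10, 11]]
    print(full_search_sad(block, area))
```

In this model the three nested loops run sequentially; in a parallel arrangement of the kind the abstract describes, the per-candidate accumulations would be spread across processing elements instead.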