In many modern image compression standards, selecting the best mode in the intra-prediction stage incurs considerable computational complexity. Various techniques attempt to reduce this complexity without degrading the performance criteria of the image. This study proposes a fast search algorithm that simplifies the mode selection process of the intra-prediction algorithm and evaluates fewer modes. A hardware architecture for the proposed algorithm is also implemented. The intra-prediction algorithm in image compression has two main parts: the image prediction process and the mode selection process. The main objectives of this study are to reduce the processing time of mode selection and to simplify the hardware design. The sum of absolute differences (SAD) is a criterion frequently used to simplify hardware design. The proposed algorithm searches for the most suitable mode in a single step, with the decision based on the SAD criterion, chosen for its simplicity. The proposed algorithm and the related hardware architecture are tested in various experiments. The number of modes calculated is reduced effectively, while the peak signal-to-noise ratio (PSNR) and compression rate (CR) remain within acceptable limits. As a result, the number of clock cycles is considerably reduced. The designed architecture is synthesized for a field-programmable gate array (FPGA) board, and the obtained results are reported. In addition, these results are compared with the HM reference software and are found to be consistent with it.
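As a rough illustration of SAD-based mode selection in general, the following Python sketch picks the candidate intra mode whose predicted block has the smallest SAD against the original block. It is not the algorithm proposed in this study: the three predictors (DC, vertical, horizontal) are simplified stand-ins chosen for illustration, and the names `sad`, `select_mode`, and `CANDIDATE_MODES` are hypothetical.

```python
# Minimal sketch of SAD-based intra mode selection (illustrative only).
# Assumption: square blocks given as lists of rows, with reference samples
# `top` (row above the block) and `left` (column to the left of the block).

def sad(block, pred):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block, pred)
        for a, b in zip(row_a, row_b)
    )

def predict_vertical(top, left):
    # Every row copies the reference row above the block.
    return [list(top) for _ in range(len(top))]

def predict_horizontal(top, left):
    # Every row is filled with the reference sample to its left.
    n = len(left)
    return [[left[r]] * n for r in range(n)]

def predict_dc(top, left):
    # Every sample is the rounded mean of all reference samples.
    n = len(top)
    dc = (sum(top) + sum(left) + n) // (2 * n)
    return [[dc] * n for _ in range(n)]

# Hypothetical candidate set; a real codec defines many more modes.
CANDIDATE_MODES = {
    "dc": predict_dc,
    "vertical": predict_vertical,
    "horizontal": predict_horizontal,
}

def select_mode(block, top, left, modes=CANDIDATE_MODES):
    """Return (mode_name, sad_cost) for the candidate with the lowest SAD."""
    best_mode, best_cost = None, None
    for name, predictor in modes.items():
        cost = sad(block, predictor(top, left))
        if best_cost is None or cost < best_cost:
            best_mode, best_cost = name, cost
    return best_mode, best_cost

if __name__ == "__main__":
    top = [10, 20, 30, 40]
    left = [10, 10, 10, 10]
    block = [list(top) for _ in range(4)]  # block identical to its top row
    print(select_mode(block, top, left))
```

Shrinking the candidate set passed as `modes` reduces the number of SAD evaluations proportionally, which is the general reason a reduced mode set lowers the cycle count in hardware.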