This study presents hardware architectures for a 16-bit fixed-point 5 × 5 2D Gaussian kernel. Two filters are proposed, one using a generalized kernel and the other using a separable kernel. The quality of both filters is evaluated using several metrics, and the generalized filter is observed to outperform all the existing methods. The proposed filters are evaluated on different images, and their performance is similar to that of the original filter, as evidenced by the differences in Peak Signal to Noise Ratio (PSNR) and Structural Similarity Index Metric (SSIM) between the proposed filters and the original filter. Based on the principle of Distributed Arithmetic (DA), two hardware architectures are proposed: the Generalized Filter Architecture (GFA) and the Separable Filter Architecture (SFA). SFA achieves an improvement in area, delay, power, Area Delay Product (ADP) and Area Power Product (APP) compared to GFA. GFA, on the other hand, achieves higher performance than SFA. The performance of both architectures is analyzed in terms of maximum speed and frames per second, and the proposed architectures are shown to achieve significantly better performance than all the existing techniques.
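To make the distinction between the generalized and separable kernels concrete, the following is a minimal Python sketch, not the architecture proposed in this study. It assumes a standard deviation of 1.0 and 16 fractional bits for the fixed-point coefficients; neither value is stated above, and all function names are hypothetical. It builds a 5-tap Gaussian, quantizes it to integers, and checks that a row pass followed by a column pass gives the same result as a single 5 × 5 pass whose kernel is the outer product of the quantized taps.

```python
import math

def gaussian_1d(size=5, sigma=1.0):
    """Normalized 1D Gaussian taps (sigma is an assumed value)."""
    half = size // 2
    w = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(w)
    return [v / total for v in w]

def quantize(coeffs, frac_bits=16):
    """Round coefficients to integers with frac_bits fractional bits (assumed format)."""
    scale = 1 << frac_bits
    return [round(c * scale) for c in coeffs]

def outer(u, v):
    """Outer product of two tap vectors, giving a 2D kernel."""
    return [[a * b for b in v] for a in u]

def filter_2d(img, kernel):
    """Direct 2D filtering over the 'valid' region: k*k multiplies per output pixel."""
    k = len(kernel)
    rows, cols = len(img), len(img[0])
    return [[sum(img[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(cols - k + 1)]
            for r in range(rows - k + 1)]

def filter_separable(img, taps):
    """Row pass then column pass: 2*k multiplies per output pixel."""
    k = len(taps)
    horiz = [[sum(row[c + j] * taps[j] for j in range(k))
              for c in range(len(row) - k + 1)]
             for row in img]
    return [[sum(horiz[r + i][c] * taps[i] for i in range(k))
             for c in range(len(horiz[0]))]
            for r in range(len(horiz) - k + 1)]

# Small deterministic 8 x 8 test image with 8-bit pixel values.
image = [[(r * 7 + c * 13) % 256 for c in range(8)] for r in range(8)]

taps = gaussian_1d()
q_taps = quantize(taps)
full = filter_2d(image, outer(q_taps, q_taps))
sep = filter_separable(image, q_taps)
```

In this sketch the two results agree exactly because all arithmetic is on integers, while the separable form needs 10 multiplications per output pixel instead of 25. A generalized kernel whose 25 coefficients are quantized individually would not in general equal the outer product of the quantized taps, which is one way the two filters can differ in output quality.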