<p>Video saliency detection is a rapidly growing area that has nevertheless received relatively few contributions; the most common approach today is to perform saliency detection frame by frame. The modified spatio-temporal fusion method presented in this paper offers a novel approach to saliency detection and mapping: it combines frame-wise global motion and color saliency with pixel-level consistent spatio-temporal diffusion to enforce temporal uniformity. In addition, several techniques are proposed to improve the overall accuracy and precision of the saliency maps. As described in the proposed-method section, the video is divided into groups of frames, and each frame undergoes temporal diffusion and integration to compute the color saliency map. Inter-group frames are then used, with the aid of a permutation matrix, to form the pixel-based saliency fusion, after which the fused features (pixel saliency combined with color information) guide the spatio-temporal saliency diffusion. The results are evaluated with five publicly available global saliency evaluation metrics, and the proposed algorithm outperforms numerous saliency detection techniques by a clear accuracy margin. The results demonstrate its robustness, dependability, adaptability, and precision.</p>
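The group-fuse-diffuse pipeline outlined in the abstract can be sketched as follows. This is a minimal illustrative sketch only: the color saliency measure (distance from the mean color), the uniform fusion weights (standing in for the paper's permutation matrix), and the 4-neighbour averaging (standing in for the paper's spatio-temporal diffusion) are all simplified placeholders, not the authors' actual formulation.

```python
import numpy as np

def color_saliency(frame):
    # Placeholder per-frame color saliency: each pixel's distance from
    # the frame's mean color, normalized to [0, 1]. The paper instead
    # uses temporal diffusion and integration, which is not detailed here.
    mean = frame.reshape(-1, 3).mean(axis=0)
    sal = np.linalg.norm(frame - mean, axis=-1)
    return sal / (sal.max() + 1e-8)

def fuse_group(saliency_maps, weights=None):
    # Pixel-wise fusion of the saliency maps within one frame group.
    # In the paper, a permutation matrix relates inter-group frames;
    # uniform averaging is used here as a stand-in.
    stack = np.stack(saliency_maps)
    if weights is None:
        weights = np.full(len(saliency_maps), 1.0 / len(saliency_maps))
    return np.tensordot(weights, stack, axes=1)

def diffuse(sal, iters=5):
    # Simple 4-neighbour averaging as a stand-in for the guided
    # spatio-temporal saliency diffusion.
    for _ in range(iters):
        sal = 0.25 * (np.roll(sal, 1, 0) + np.roll(sal, -1, 0)
                      + np.roll(sal, 1, 1) + np.roll(sal, -1, 1))
    return sal

# Example: a tiny synthetic "group" of 4 random 8x8 RGB frames.
rng = np.random.default_rng(0)
frames = rng.random((4, 8, 8, 3))
group_maps = [color_saliency(f) for f in frames]
fused = diffuse(fuse_group(group_maps))
```

The fused map keeps the per-frame spatial resolution while smoothing out pixel-level disagreement between frames in the group.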