Abstract
Saliency detection simulates human perception in locating crucial regions, enabling further processing for many practical applications. Although saliency prediction for conventional 2D images and videos has been well developed, prediction on 360° content is still challenging. Each pixel in an equirectangular frame has surrounding pixels determined by its spherical coordinates rather than by its position in the projected grid. Conventional convolution may therefore introduce inaccuracy when attempting to simulate how humans perceive the surrounding environment. This paper proposes a novel spherical convolutional network for 360° video saliency prediction in which the kernel is defined as a spherical cap. During convolution, instead of using neighboring pixels that have a regular relationship in equirectangular projection coordinates, the convolutional patches are changed to preserve the spherical perspective of the spherical signal. Our model is trained and tested on a dataset of 104 360° videos featuring dynamic sports content. The proposed spherical convolutional network is evaluated using the Pearson correlation coefficient (CC) and the Kullback-Leibler divergence (KLD). Our experiments show the efficiency of the proposed spherical convolution method when applied to 360° video saliency detection with a spherical U-Net model. Further analysis of the proposed system is also presented in this study.
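As an illustration of the two evaluation metrics named above, the following is a minimal sketch of CC and KLD computed between a predicted and a ground-truth saliency map. It assumes both maps are given as flattened sequences of non-negative values with a non-zero sum. The function names `pearson_cc` and `kl_divergence` and the smoothing constant `eps` are hypothetical choices made for this example. The abstract does not state whether the scores are weighted by latitude to compensate for equirectangular distortion, so no such weighting is applied here.

```python
import math


def pearson_cc(pred, gt):
    """Pearson correlation coefficient between two flattened saliency maps."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_g = sum(gt) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gt))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_g = sum((g - mean_g) ** 2 for g in gt)
    denom = math.sqrt(var_p * var_g)
    # A constant map has zero variance; return 0.0 instead of dividing by zero.
    return cov / denom if denom > 0 else 0.0


def kl_divergence(pred, gt, eps=1e-12):
    """KL divergence of the ground truth from the prediction.

    Both maps are first normalized to sum to 1 so that they can be
    treated as probability distributions. `eps` is an assumed smoothing
    constant that avoids log(0) and division by zero.
    """
    sum_p = sum(pred)
    sum_g = sum(gt)
    total = 0.0
    for p, g in zip(pred, gt):
        p_n = p / sum_p
        g_n = g / sum_g
        total += g_n * math.log(eps + g_n / (p_n + eps))
    return total
```

A higher CC and a lower KLD both indicate a closer match: identical maps give a CC of 1 and a KLD of approximately 0.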