Saliency Prediction Research Articles

Augmented reality (AR) overlays digital content onto reality. In an AR system, correct and precise estimates of the user's visual fixations and head movements can enhance the quality of experience by allocating more computational resources to analysis, rendering, and 3D registration in the areas of interest. However, little research has examined how users visually explore a scene when using an AR system, or how to model visual attention in AR. To bridge the gap between saliency prediction on real-world scenes and on scenes augmented by virtual information, we construct the ARVR saliency dataset. Virtual reality (VR) is used to simulate the real world. Object recognition and tracking annotations are blended into omnidirectional videos as augmented content. Saliency annotations of head and eye movements are collected for both the original and the augmented videos, and together they constitute the ARVR dataset. We also design a model for saliency prediction in AR. Local block images are extracted to simulate the viewport and offset the projection distortion. Conspicuous visual cues in the local block images are extracted to form the spatial features, and optical flow is estimated as an important temporal feature. We also consider the interplay between virtual information and reality: the composition of the augmentation information is distinguished, and the joint effects of adversarial augmentation and complementary augmentation are estimated. A Markov chain is constructed with block images as graph nodes. The edge weights are determined from both the characteristics of viewing behavior and the mechanisms of visual saliency. The order of importance of the block images is estimated from the equilibrium state of the Markov chain. Extensive experiments are conducted to demonstrate the effectiveness of the proposed method.
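The ranking step described in this abstract, where block images are ordered by the equilibrium state of a Markov chain, can be illustrated with a minimal sketch. The edge weights below are made-up numbers for three hypothetical blocks, and the function name `stationary_distribution` is an assumption for illustration; the abstract does not give the paper's actual weighting scheme, so this only shows the general technique of row-normalizing a weight matrix and iterating to the stationary distribution.

```python
def stationary_distribution(weights, tol=1e-10, max_iter=10_000):
    """Return the equilibrium distribution of a Markov chain.

    `weights[i][j]` is a non-negative edge weight from block i to block j
    (hypothetical values here, not the paper's weighting scheme).
    """
    n = len(weights)

    # Row-normalize the weights so each row is a probability distribution.
    transition = []
    for row in weights:
        total = sum(row)
        if total <= 0:
            raise ValueError("every block needs at least one outgoing edge")
        transition.append([w / total for w in row])

    # Power iteration: repeatedly apply the transition matrix to a
    # uniform starting distribution until it stops changing.
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * transition[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Assumed symmetric edge weights between three block images.
edge_weights = [
    [1.0, 2.0, 1.0],
    [2.0, 1.0, 3.0],
    [1.0, 3.0, 1.0],
]

pi = stationary_distribution(edge_weights)

# Blocks ordered from most to least important by equilibrium probability.
ranking = sorted(range(len(pi)), key=lambda i: pi[i], reverse=True)
```

Because these example weights are symmetric, the equilibrium probability of each block is proportional to its row sum (4, 6, and 5), so the second block ranks first. With asymmetric weights the same iteration still applies as long as the chain is irreducible and aperiodic.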

Image saliency detection, to which much effort has been devoted in recent years, has advanced significantly. In contrast, the community has paid little attention to video saliency detection. In particular, existing video saliency models are very likely to fail on videos with difficult scenarios such as fast motion, dynamic backgrounds, and nonrigid deformation. Furthermore, applying image saliency models directly to video is inappropriate because they ignore temporal information. To address these issues, this study proposes a novel end-to-end spatiotemporal integration network (STI-Net) for detecting salient objects in videos. Specifically, our method consists of three key steps: feature aggregation, saliency prediction, and saliency fusion, which are applied sequentially to generate spatiotemporal deep feature maps, coarse saliency predictions, and the final saliency map. The key advantage of our model lies in its comprehensive exploration of spatial and temporal information across the entire network: the two kinds of features interact with each other in the feature aggregation step, are used to construct boundary cues in the saliency prediction step, and also serve as the original information in the saliency fusion step. As a result, the generated spatiotemporal deep feature maps precisely and completely characterize the salient objects, and the coarse saliency predictions have well-defined boundaries, which effectively improves the quality of the final saliency map. Furthermore, “shortcut connections” are introduced into our model so that the proposed network is easy to train and yields accurate results even when it is deep. Extensive experimental results on two publicly available challenging video datasets demonstrate the effectiveness of the proposed model, which achieves performance comparable to state-of-the-art saliency models.
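The “shortcut connections” mentioned in this abstract refer to the general idea of adding a layer's input back to its output, so that a deep network can fall back to passing its input through unchanged. The sketch below illustrates only that general idea on plain Python lists; the helper names (`linear`, `relu`, `shortcut_block`) and the tiny weights are assumptions for illustration, and the abstract does not describe where or how STI-Net places its shortcuts.

```python
def linear(vector, weights, bias):
    """Apply a dense layer: one output per row of `weights`."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in vector]


def shortcut_block(vector, weights, bias):
    """Return input + relu(linear(input)).

    The addition of `vector` is the shortcut: if the learned
    transformation contributes nothing, the block passes its
    input through unchanged.
    """
    transformed = relu(linear(vector, weights, bias))
    return [x + t for x, t in zip(vector, transformed)]


features = [1.0, -2.0]

# With all-zero parameters the block reduces to the identity mapping.
zero_out = shortcut_block(features, [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])

# With identity weights the transformed features are added on top of the input.
ident_out = shortcut_block(features, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

The first call shows why such blocks tend to be easier to train when stacked deeply: a block whose parameters are near zero still forwards its input intact.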

Articles published on Saliency Prediction

A Trained Humanoid Robot can Perform Human-Like Crossmodal Social Attention and Conflict Resolution

Loop closure detection with patch-level local features and visual saliency prediction

Global–local–global context-aware network for salient object detection in optical remote sensing images

Toward Visual Behavior and Attention Understanding for Augmented 360 Degree Videos

Simulating Urban Element Design with Pedestrian Attention: Visual Saliency as Aid for More Visible Wayfinding Design

Quality-Driven Dual-Branch Feature Integration Network for Video Salient Object Detection

STI-Net: Spatiotemporal integration network for video saliency detection

Specificity-preserving RGB-D saliency detection

Advertising Image Saliency Prediction Method Based on Score Level Fusion

An Energy-Based Prior for Generative Saliency

Saliency Prediction in Uncategorized Videos Based on Audio-Visual Correlation

S$^3$Net: Self-Supervised Self-Ensembling Network for Semi-Supervised RGB-D Salient Object Detection

Exploring viewport features for semi-supervised saliency prediction in omnidirectional images

Spatio-Temporal Self-Attention Network for Video Saliency Prediction

Viewing Bias Matters in 360° Videos Visual Saliency Prediction

A geospatial image based eye movement dataset for cartography and GIS

Intra- and Inter-Reasoning Graph Convolutional Network for Saliency Prediction on 360° Images

Two‐stage local attention network for salient object detection in remote sensing images

Attention-based pyramid decoder network for salient object detection in remote sensing images

Describing UI Screenshots in Natural Language
