Abstract

Crowd saliency prediction refers to predicting where people look at in crowd scene. Humans have remarkable ability to rapidly direct their gaze to select visual information of interest when looking at a visual scene. Until now, research efforts are still focused on what type of feature is representative for crowd saliency, and which type of learning model is robust for crowd saliency prediction. In this paper, we propose a Random Forest (RF) based crowd saliency prediction approach with optimal feature combination, i.e., the Feature Combination Selection for Crowd Saliency (FCSCS) framework. More specifically, we first define three representative crowd saliency features, namely, FaceSizeDiff, FacePoseDiff and FaceWhrDiff. Next, we adopt the Random Forest (RF) algorithm to construct our saliency learning model. Then, we evaluate the performance of FCSCS framework with different feature combinations (fifteen combinations in our experiments). Those selected features include low-level features (i.e., color, intensity, orientation), four crowd features (i.e., face size, face density, frontal face, profile face) and three new defined features (i.e., FaceSizeDiff, FacePoseDiff and FaceWhrDiff). We use FCSCS framework to obtain the optimal feature combination that is most suitable for crowd saliency prediction and further train the saliency model based on the optimal feature combination. After that, we evaluate the performance of the crowd saliency prediction classifiers. Finally, we conduct extensive experiments and empirical evaluation to demonstrate the satisfactory performance of our approach.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call