Abstract

Conversation scenes are a typical example in which classical models of visual attention dramatically fail to predict eye positions. Indeed, these models rarely consider faces as particular gaze attractors and never take into account the important auditory information that always accompanies dynamic social scenes. We recorded the eye movements of participants viewing dynamic conversations taking place in various contexts. Conversations were seen either with their original soundtracks or with unrelated soundtracks (unrelated speech and abrupt or continuous natural sounds). First, we analyze how auditory conditions influence the eye movement parameters of participants. Then, we model the probability distribution of eye positions across each video frame with a statistical method (Expectation-Maximization), allowing the relative contribution of different visual features such as static low-level visual saliency (based on luminance contrast), dynamic low level visual saliency (based on motion amplitude), faces, and center bias to be quantified. Through experimental and modeling results, we show that regardless of the auditory condition, participants look more at faces, and especially at talking faces. Hearing the original soundtrack makes participants follow the speech turn-taking more closely. However, we do not find any difference between the different types of unrelated soundtracks. These eyetracking results are confirmed by our model that shows that faces, and particularly talking faces, are the features that best explain the gazes recorded, especially in the original soundtrack condition. Low-level saliency is not a relevant feature to explain eye positions made on social scenes, even dynamic ones. Finally, we propose groundwork for an audiovisual saliency model.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.