Online Estimation of Evolving Human Visual Interest

Harish Katti, Ramakrishnan Kalpathi, Mohan Kankanhalli, Anoop Kolar Rajagopal

doi:10.1145/2632284

Abstract

Regions in video streams that attract human interest contribute significantly to human understanding of the video. Predicting salient and informative Regions of Interest (ROIs) from a sequence of eye movements is a challenging problem. Applications such as content-aware retargeting of videos to different aspect ratios while preserving informative regions, and smart insertion of dialog (closed-caption text) into the video stream, can be significantly improved using the predicted ROIs. We propose an interactive human-in-the-loop framework to model eye movements and predict visual saliency in yet-unseen frames. Eye tracking and video content are used to model visual attention in a manner that accounts for important eye-gaze characteristics such as temporal discontinuities due to sudden eye movements, noise, and behavioral artifacts. A novel statistical- and algorithm-based method, gaze buffering, is proposed for eye-gaze analysis and its fusion with content-based features. Our robust saliency prediction is instantiated for two challenging and exciting applications. The first application alters video aspect ratios on the fly using content-aware video retargeting, thus making them suitable for a variety of display sizes. The second application dynamically localizes active speakers and places dialog captions on the fly in the video stream. Our method ensures that dialogs are faithful to active speaker locations and do not interfere with salient content in the video stream. Our framework naturally accommodates personalisation of the application to suit biases and preferences of individual users.

Journal: ACM Transactions on Multimedia Computing, Communications, and Applications
Publication Date: Aug 1, 2014
Citations: 13

Similar Papers

A Case for Studying Naturalistic Eye and Head Movements in Virtual Environments.
Chloe Callahan-Flintoft ... Christian Barentine
Frontiers in Psychology | VOL. 12
31 Dec 2021

Multi-microphone simultaneous speakers detection and localization of multi-sources for separation and noise reduction
Ayal Schwartz ... Sharon Gannot
EURASIP Journal on Audio, Speech, and Music Processing | VOL. 2024
04 Oct 2024

AS-Net: active speaker detection using deep audio-visual attention
Abduljalil Radman ... Jorma Laaksonen
Multimedia Tools and Applications | VOL. 83
05 Feb 2024

Cross Modal Video Representations for Weakly Supervised Active Speaker Localization
Rahul Sharma ... Shrikanth Narayanan
IEEE Transactions on Multimedia | VOL. 25
01 Jan 2023
