Abstract

Past research on automatic laughter detection has focused mainly on audio-based detection. Here we present an audiovisual approach to distinguishing laughter from speech, and we show that integrating the information from the audio and video channels leads to improved performance over single-modal approaches. Each channel consists of two streams (cues): facial expressions and head movements for video, and spectral and prosodic features for audio. We used decision-level fusion to integrate the information from the two channels and experimented with the SUM rule and a neural network as the integration functions. The results indicate that even a simple linear function such as the SUM rule achieves very good performance in audiovisual fusion. We also experimented with different combinations of cues, the most informative being the facial expressions and the spectral features. The best combination of cues is the integration of facial expressions, spectral features, and prosodic features when a neural network is used as the fusion method. When tested on 96 audiovisual sequences depicting spontaneously displayed (as opposed to posed) laughter and speech episodes, in a person-independent way, the proposed audiovisual approach achieves over 90% recall and over 80% precision.
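As an illustration of the decision-level SUM-rule fusion described above, the following minimal Python sketch (not the authors' implementation) averages hypothetical per-cue posterior probabilities for the two classes, laughter and speech, and picks the class with the highest combined score. The cue names and probability values are assumptions for demonstration only.

```python
import numpy as np

def sum_rule_fusion(cue_probabilities):
    """Decision-level fusion via the SUM rule: average the posterior
    probabilities produced by the per-cue classifiers and return the
    index of the class with the highest combined score."""
    combined = np.mean(np.stack(cue_probabilities, axis=0), axis=0)
    return int(np.argmax(combined))  # 0 = speech, 1 = laughter

# Hypothetical posteriors [P(speech), P(laughter)] for one episode,
# one vector per cue: facial expressions, head movements,
# spectral features, prosodic features.
facial   = np.array([0.20, 0.80])
head     = np.array([0.45, 0.55])
spectral = np.array([0.30, 0.70])
prosodic = np.array([0.40, 0.60])

label = sum_rule_fusion([facial, head, spectral, prosodic])
print("laughter" if label == 1 else "speech")
```

In the same setting, the neural-network alternative would replace the fixed averaging step with a small trained network that takes the concatenated per-cue posteriors as input and outputs the fused class scores.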

