Audio head pose estimation using the direct to reverberant speech ratio

Mark Barnard, Josef Kittler, Wenwu Wang

doi:10.1109/icassp.2013.6639234

Abstract

Head pose is an important cue in many applications, such as speech recognition and face recognition. Most approaches to head pose estimation to date have used visual information to model and recognise a subject's head in different configurations. These approaches have a number of limitations, such as an inability to cope with occlusions, changes in the appearance of the head, and low-resolution images. We present a novel method for determining coarse head pose orientation purely from audio information, exploiting the direct-to-reverberant speech energy ratio (DRR) in a highly reverberant meeting room environment. Our hypothesis is that a speaker facing towards a microphone will have a higher DRR, and a speaker facing away from the microphone will have a lower DRR. This hypothesis is confirmed by experiments conducted on the publicly available AV16.3 database.
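
To make the hypothesis concrete, the sketch below computes the DRR from a room impulse response using the common textbook definition: energy in a short window around the direct-path peak divided by the energy that arrives afterwards. This is not the estimator used in the paper, which works from recorded speech and is not described in the abstract. The function names, the 2.5 ms window, and the toy impulse response are all illustrative assumptions; in particular, modelling "facing away" as a weaker direct path with an unchanged reverberant tail is a deliberate simplification.

```python
import numpy as np


def drr_db(rir, fs, direct_window_ms=2.5):
    """Direct-to-reverberant ratio in dB for a room impulse response.

    Assumption: the direct path is the largest-magnitude sample, and the
    direct sound lies within +/- direct_window_ms of that peak.
    """
    rir = np.asarray(rir, dtype=float)
    peak = int(np.argmax(np.abs(rir)))
    half = int(round(direct_window_ms * 1e-3 * fs))
    start = max(peak - half, 0)
    stop = min(peak + half + 1, rir.size)
    direct = np.sum(rir[start:stop] ** 2)
    reverberant = np.sum(rir[stop:] ** 2)
    if reverberant == 0.0:
        raise ValueError("no energy after the direct-path window")
    return 10.0 * np.log10(direct / reverberant)


def toy_rir(direct_gain, fs=16000, seed=0):
    """Hypothetical impulse response: one direct impulse plus a decaying tail.

    Assumption: turning the head changes only the direct-path gain; the
    reverberant tail is identical for both orientations (same seed).
    """
    rng = np.random.default_rng(seed)
    n = fs // 2                       # 0.5 s response
    delay = int(0.005 * fs)           # direct sound arrives after 5 ms
    tail_start = delay + int(0.005 * fs)
    rir = np.zeros(n)
    rir[delay] = direct_gain
    t = np.arange(n - tail_start) / fs
    rir[tail_start:] = 0.02 * rng.standard_normal(t.size) * np.exp(-t / 0.1)
    return rir


fs = 16000
facing = drr_db(toy_rir(1.0, fs), fs)   # speaker facing the microphone
away = drr_db(toy_rir(0.3, fs), fs)     # speaker facing away (weaker direct path)
print(f"facing: {facing:.1f} dB, away: {away:.1f} dB")
```

Under these assumptions the facing case always yields the higher DRR, because only the numerator of the ratio changes. In a real room the reverberant energy also depends on orientation, so the gap would not reduce to a simple gain difference.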

