Visual Voice Activity Detection and Adaptive Threshold Estimation for Improving Speech Recognizer Performance

Taeyup Song, Sung Soo Kim, Jae-Won Lee, Hanseok Ko, Kyungsun Lee

doi:10.7776/ask.2015.34.4.321

Abstract

In this paper, we propose an algorithm for robust Visual Voice Activity Detection (VVAD) to improve speech recognition performance. Conventional VVAD algorithms detect visual speech frames by measuring the motion of the lip region with optical flow or chaos-inspired measures. Optical-flow-based VVAD is difficult to adopt in driving scenarios because of its computational complexity. The chaos-theory-based VVAD method, while invariant to illumination changes, is sensitive to translational motion caused by the driver's head movements. The proposed Local Variance Histogram (LVH) is robust to pixel intensity changes caused by both illumination changes and translation. To further improve performance under environmental changes, we adopt a novel threshold estimation method based on the total variance change. Experimental results show that the proposed VVAD algorithm is robust in various driving situations.
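
The abstract does not define the LVH or the threshold estimator, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the lip region is a small grayscale grid, computes the intensity variance of each non-overlapping block, and histograms those variances. Block variance is unchanged by a uniform brightness offset, and a histogram discards block positions, which is one plausible reading of why such a feature tolerates illumination and translation changes. The function names, block size, bin count, and the running-baseline threshold are all hypothetical stand-ins and are not the paper's "total variance change" method.

```python
def _variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def local_variances(frame, block=2):
    """Variance of each non-overlapping block x block patch (assumed layout)."""
    height, width = len(frame), len(frame[0])
    variances = []
    for r in range(0, height - block + 1, block):
        for c in range(0, width - block + 1, block):
            patch = [frame[r + i][c + j]
                     for i in range(block) for j in range(block)]
            variances.append(_variance(patch))
    return variances


def variance_histogram(variances, bins=8, max_var=4096.0):
    """Normalized histogram of local variances; values above max_var are clipped."""
    hist = [0] * bins
    for v in variances:
        hist[min(int(v / max_var * bins), bins - 1)] += 1
    total = sum(hist) or 1
    return [count / total for count in hist]


def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_speech(frames, alpha=0.9, scale=2.0, floor=0.05):
    """Flag frames whose histogram change exceeds an adaptive threshold.

    Assumption: the threshold tracks a running average of the change seen
    in non-speech frames. This is a stand-in, not the paper's estimator.
    """
    flags, prev, baseline = [], None, 0.0
    for frame in frames:
        hist = variance_histogram(local_variances(frame))
        if prev is None:
            flags.append(False)  # no previous frame to compare against
        else:
            change = histogram_distance(prev, hist)
            active = change > max(floor, scale * baseline)
            flags.append(active)
            if not active:
                # Update the background level only from non-speech frames.
                baseline = alpha * baseline + (1 - alpha) * change
        prev = hist
    return flags


# Toy 4x4 "lip regions": flat, uniformly brighter, then strongly textured.
closed = [[100] * 4 for _ in range(4)]
brighter = [[150] * 4 for _ in range(4)]
opened = [[0, 255, 0, 255], [255, 0, 255, 0]] * 2
flags = detect_speech([closed, brighter, opened])
```

In this toy run the brightness-only change between the first two frames leaves the histogram untouched, while the textured third frame shifts it and is flagged. A real system would operate on tracked lip regions from video and would need the paper's actual feature and threshold definitions.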


Journal: The Journal of the Acoustical Society of Korea
Publication Date: Jul 31, 2015
Citations: 10
