‘vVISWa’ – A Multilingual Multi-Pose Audio Visual Database for Robust Human Computer Interaction

Bharti Gawali, Prashant Borde, Pravin Yannawar, Ramesh Manza

doi:10.5120/ijca2016908696

Abstract

Automatic Speech Recognition (ASR) is an active research topic in signal processing and pattern recognition. In recent years, automatic speech-reading systems have advanced considerably by combining audio and visual speech features to recognize words under noisy conditions. The objective of an audio-visual speech recognition (AVSR) system is to improve recognition accuracy. Developing robust AVSR systems for Human Computer Interaction requires suitable, simultaneously recorded speech and video data. This paper describes the ‘vVISWa’ (Visual Vocabulary of Independent Standard Words) database, which consists of audio-visual data from 48 native speakers and 10 non-native speakers. These speakers contributed to the corpus in three poses: full frontal, 45° profile, and side pose. The database was designed primarily for multi-pose audio-visual speech recognition in three languages: ‘Marathi’ (the native language of Maharashtra), ‘Hindi’ (an official language of India), and ‘English’ (a widely used international language). It is a multi-pose, multilingual database created in the Indian context. The database is available on request from http://visbamu.in/viswaDataset.html.

Journal: International Journal of Computer Applications
Publication Date: Mar 17, 2016
Citations: 12
