Abstract

Viseme recognition from speech is one of the methods needed to operate a talking head system, which can be used in various areas such as mobile services and applications, gaming, and the entertainment industry. This paper proposes a novel method for generating acoustic models for viseme recognition from speech. The viseme acoustic models were generated using transformations from trained phoneme acoustic models. The proposed transformation method is language-independent; only the available speech resources are needed. The viseme sequence with corresponding time information was produced as a result of recognition using context-dependent acoustic models. The evaluation of the proposed acoustic models’ transformation method was carried out on a test scenario with phonetically balanced words, and the results were compared to a baseline viseme recognition system. The improvement in viseme accuracy was statistically significant when using the proposed transformation method.

DOI: http://dx.doi.org/10.5755/j01.eee.19.9.5657
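As a rough, language-agnostic illustration of the transformation idea, the sketch below seeds viseme acoustic models by grouping trained phoneme models under viseme labels through a many-to-one phoneme-to-viseme map. The map, the viseme labels, and the `PhonemeModel` container are hypothetical stand-ins for illustration only; the paper's actual viseme inventory and model format are not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PhonemeModel:
    """Placeholder for a trained phoneme acoustic model (hypothetical)."""
    name: str                                   # phoneme label, e.g. "m"
    params: dict = field(default_factory=dict)  # e.g. HMM/GMM parameters


# Hypothetical many-to-one phoneme -> viseme map; a real system would use
# the viseme inventory defined for its language and talking-head model.
PHONEME_TO_VISEME = {
    "p": "V_bilabial", "b": "V_bilabial", "m": "V_bilabial",
    "f": "V_labiodental", "v": "V_labiodental",
    "a": "V_open", "e": "V_mid", "i": "V_spread",
}


def seed_viseme_models(phoneme_models):
    """Group trained phoneme models under their viseme label; each group
    then serves as the starting point for (re)training a viseme model."""
    groups = defaultdict(list)
    for model in phoneme_models:
        viseme = PHONEME_TO_VISEME.get(model.name)
        if viseme is not None:
            groups[viseme].append(model)
    return dict(groups)
```

Because the map itself is the only language-specific ingredient, swapping in a different phoneme-to-viseme table adapts the same procedure to another language, which is the sense in which the transformation is language-independent.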

Highlights

  • Advanced human computer interfaces may include a virtual assistant [1] to achieve natural communication with the user

  • If a speech synthesis module is used for generating the speech signal, a viseme sequence with time boundaries can be produced using mapping from the phoneme sequence, which is usually an intermediate result of a speech synthesis algorithm [9]

  • The results are presented as viseme accuracy (VA), which is defined as VA(%) = (N − S − D − I) / N × 100, where N is the number of visemes in the reference and S, D, and I are the substitution, deletion, and insertion errors (a runnable version is sketched after this list)
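To make the VA formula concrete, the sketch below computes N, S, D, and I from a minimum-edit-distance alignment of the recognized viseme sequence against the reference. It is a standard Levenshtein dynamic program, not code from the paper.

```python
def viseme_accuracy(reference, hypothesis):
    """Return VA(%) = (N - S - D - I) / N * 100 for two viseme sequences."""
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference sequence must be non-empty")
    # Each cell holds (total edits, substitutions, deletions, insertions)
    # for aligning reference[:i] against hypothesis[:j].
    cost = [[(0, 0, 0, 0) for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)          # delete every reference viseme
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)          # insert every hypothesis viseme
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if reference[i - 1] == hypothesis[j - 1]:
                cost[i][j] = cost[i - 1][j - 1]        # match, no edit
                continue
            sub, dele, ins = cost[i-1][j-1], cost[i-1][j], cost[i][j-1]
            best = min(sub, dele, ins)                 # fewest edits wins
            if best is sub:
                cost[i][j] = (best[0] + 1, best[1] + 1, best[2], best[3])
            elif best is dele:
                cost[i][j] = (best[0] + 1, best[1], best[2] + 1, best[3])
            else:
                cost[i][j] = (best[0] + 1, best[1], best[2], best[3] + 1)
    _, s, d, i_ = cost[n][m]
    return (n - s - d - i_) / n * 100.0


# One substitution in a four-viseme reference gives VA = 75%.
print(viseme_accuracy(["A", "B", "C", "D"], ["A", "X", "C", "D"]))
```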


Summary

INTRODUCTION

Advanced human computer interfaces may include a virtual assistant [1] to achieve natural communication with the user. A sequence of visemes with appropriate time boundaries is needed to model the movement of a talking head’s mouth [7], [8]. If the recorded or live speech signal is used for the talking head’s spoken modality, viseme recognition must be carried out in order to produce a viseme sequence with time information [10]; this paper therefore focuses on the acoustic modeling of visemes in speech. It proposes a new viseme acoustic modeling method in which viseme models are transformed from phoneme acoustic models and trained in subsequent steps into context-dependent viseme acoustic models. The proposed acoustic models’ transformation method is language-independent, and can be used for any language with available speech resources.
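A minimal sketch of the mapping step mentioned above, assuming a time-aligned phoneme sequence as input (triples of phoneme, start time, end time) and a hypothetical phoneme-to-viseme map: each phoneme is relabeled with its viseme class, and adjacent identical visemes are merged so the output carries one time interval per mouth shape. The merging rule and the fallback class are assumptions for illustration, not the paper's exact procedure.

```python
# Hypothetical phoneme -> viseme map (illustrative only).
PHONEME_TO_VISEME = {"p": "V_bilabial", "m": "V_bilabial", "a": "V_open"}


def phonemes_to_visemes(alignment, phoneme_to_viseme):
    """alignment: list of (phoneme, start_s, end_s) triples, times in
    seconds; returns (viseme, start_s, end_s) triples with adjacent
    duplicate visemes merged into a single interval."""
    visemes = []
    for phoneme, start, end in alignment:
        viseme = phoneme_to_viseme.get(phoneme, "V_neutral")  # assumed fallback
        if visemes and visemes[-1][0] == viseme:
            prev_viseme, prev_start, _ = visemes[-1]
            visemes[-1] = (prev_viseme, prev_start, end)      # extend interval
        else:
            visemes.append((viseme, start, end))
    return visemes


# /p m a/ -> the two bilabial phonemes collapse into one viseme interval:
# [("V_bilabial", 0.0, 0.12), ("V_open", 0.12, 0.25)]
print(phonemes_to_visemes(
    [("p", 0.00, 0.06), ("m", 0.06, 0.12), ("a", 0.12, 0.25)],
    PHONEME_TO_VISEME))
```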

VISEME SPEECH RECOGNITION
SPEECH DATABASE
EXPERIMENTAL SYSTEM
RESULTS
CONCLUSIONS