Simultaneous recognition of phone and speaker using three-way restricted Boltzmann machine

Toru Nakashika, Yasuhiro Minami

doi:10.1121/1.4969521

Abstract

In this study, we attempt simultaneous recognition of phone and speaker using a single energy-based model, a three-way restricted Boltzmann machine (3WRBM). The proposed model is a probabilistic model over three variables: acoustic features, latent phonetic features, and speaker-identity features. The model is trained so that it automatically captures the strength of the relationships among the three variables. Once trained, the model can be applied to many speech signal processing tasks because it can separate phoneme-related and speaker-related information from the observed speech and, conversely, generate a speech signal from that information. Simultaneous phone and speaker recognition is achieved by estimating the latent phonetic features and the speaker-identity features given the input signal. In our experiments, we discuss the effectiveness of the model in a speaker recognition task and a speech (continuous phone) recognition task.
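
The abstract does not give the model's energy function, so the following is only a minimal sketch of how a generic three-way energy-based model could support the inference step described above. Everything in it is an assumption made for illustration and is not taken from the paper: unit-variance Gaussian acoustic features v, binary latent phonetic units h, a one-hot speaker vector s, the energy E(v, h, s) = 0.5*||v - b||^2 - sum_ijk W[i,j,k]*v[i]*h[j]*s[k] - c.h - d.s, and the names W, b, c, d, speaker_posterior, and phone_posterior.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h, s, W, b, c, d):
    # Assumed energy: Gaussian visible term, one three-way interaction
    # term, and bias terms for the phonetic and speaker units.
    three_way = np.einsum("i,j,k,ijk->", v, h, s, W)
    return 0.5 * np.sum((v - b) ** 2) - three_way - c @ h - d @ s

def speaker_posterior(v, W, c, d):
    # p(s = k | v): the binary phonetic units are summed out analytically,
    # which leaves one softplus term per hidden unit for each speaker k.
    act = np.einsum("i,ijk->jk", v, W) + c[:, None]   # shape (J, K)
    log_p = d + softplus(act).sum(axis=0)             # shape (K,)
    log_p -= log_p.max()                              # avoid overflow
    p = np.exp(log_p)
    return p / p.sum()

def phone_posterior(v, s, W, c):
    # p(h_j = 1 | v, s) for every latent phonetic unit j.
    return sigmoid(c + np.einsum("i,k,ijk->j", v, s, W))

# Toy usage with random, untrained parameters (illustrative sizes only).
rng = np.random.default_rng(0)
I, J, K = 5, 4, 3                      # acoustic dim, phonetic units, speakers
W = 0.1 * rng.normal(size=(I, J, K))
b, c, d = rng.normal(size=I), rng.normal(size=J), rng.normal(size=K)
v = rng.normal(size=I)

p_speaker = speaker_posterior(v, W, c, d)
s_hat = np.eye(K)[np.argmax(p_speaker)]      # most probable speaker, one-hot
p_phone = phone_posterior(v, s_hat, W, c)    # phonetic features given v and s_hat
```

In this sketch both recognition results come from the same set of parameters: the speaker estimate is the most probable one-hot s given v, and the phonetic features are then read off conditioned on v and that estimate.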

More From: Journal of the Acoustical Society of America

Similar Papers

Bottleneck and Embedding Representation of Speech for DNN-based Language and Speaker Recognition
Alicia Lozano-Diez ... Joaquin Gonzalez-Rodriguez
21 Nov 2018

Speaker and Speech Recognition Using Wavelets
Μιχάλης Σιαφαρίκας
01 Jan 2009

Discriminative Input Stream Combination for Conditional Random Field Phone Recognition
I Heintz ... E Fosler-Lussier
IEEE Transactions on Audio, Speech, and Language Processing | VOL. 17
01 Nov 2009

Exploring convolutional neural network structures and optimization techniques for speech recognition
Ossama Abdel-Hamid ... Dong Yu
25 Aug 2013
