Decision‐tree‐based F0 quantization for hidden Markov model‐based speech coding at 100 bit/s

Yoshihiro Itogawa, Yoshihiko Nankaku, Akinobu Li, Heiga Zen, Keiichi Tokuda

doi:10.1121/1.4787195

Abstract

A decision‐tree‐based F0 quantization scheme for a very low bit rate speech coder based on hidden Markov models (HMMs) is described. The encoder carries out HMM‐based phoneme recognition; the recognized phonemes, state durations, and F0 sequence are then quantized, Huffman coded, and transmitted. In the decoder, sequences of mel‐cepstral coefficient vectors and F0 values are generated from the concatenated HMM using the HMM‐based speech synthesis technique. Finally, a speech waveform is synthesized by the MLSA filter from the generated mel‐cepstral coefficient and F0 sequences. In the previous system, an MSD‐VQ codebook was trained for each phoneme for F0 quantization. Although this scheme quantizes F0 sequences efficiently, achieving better speech quality requires larger codebooks, which increases the bit rate of the system. To avoid this problem, we cluster F0 sequences using phonetic decision trees and then train a codebook for each leaf node. In both encoding and decoding, the codebook to be used is determined by tracing the decision tree. This allows smaller codebooks to be used, since the number of codebooks can be increased without increasing the bit rate. A subjective listening test shows that the proposed scheme improves the quality of coded speech.
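The codebook-selection idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Context` fields, the single vowel question, the toy codebook values, and the plain nearest-neighbour search (used here in place of MSD‐VQ, with every frame assumed voiced) are all assumptions made for illustration. What it shows is that the encoder and decoder trace the same tree from phoneme context the decoder already receives, so only the index within the selected leaf codebook needs to be transmitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Context:
    """Hypothetical phonetic context available to both encoder and decoder."""
    phoneme: str
    prev_phoneme: str
    next_phoneme: str


@dataclass
class Node:
    """Decision-tree node: internal nodes hold a question, leaves a codebook."""
    question: Optional[Callable[[Context], bool]] = None
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    codebook: Optional[List[List[float]]] = None  # set on leaf nodes only


def find_codebook(node: Node, ctx: Context) -> List[List[float]]:
    """Trace the tree from the root to a leaf by answering context questions."""
    while node.codebook is None:
        node = node.yes if node.question(ctx) else node.no
    return node.codebook


def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(tree: Node, ctx: Context, f0: List[float]) -> int:
    """Return the index of the nearest codeword in the leaf codebook."""
    codebook = find_codebook(tree, ctx)
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], f0))


def decode(tree: Node, ctx: Context, index: int) -> List[float]:
    """Recover the codeword; the leaf is found from context, not from extra bits."""
    return find_codebook(tree, ctx)[index]


# Toy tree with one assumed question and made-up F0 codewords in Hz.
VOWELS = {"a", "i", "u", "e", "o"}
tree = Node(
    question=lambda c: c.phoneme in VOWELS,
    yes=Node(codebook=[[120.0, 130.0], [180.0, 170.0]]),
    no=Node(codebook=[[100.0, 100.0], [150.0, 150.0]]),
)

ctx = Context(phoneme="a", prev_phoneme="k", next_phoneme="t")
index = encode(tree, ctx, [125.0, 128.0])
restored = decode(tree, ctx, index)
```

In this sketch, adding more leaves to the tree increases the total number of codebooks while the index for each segment still addresses only the two codewords of one leaf, which is the property the abstract relies on.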


