Comparative study of feature extraction methods for direct word discovery with NPB-DAA from natural speech signals

Yuki Tada, Tadahiro Taniguchi, Yoshinobu Hagiwara

doi:10.1109/devlrn.2017.8329802

Abstract

Human infants can discover words directly from unsegmented speech signals produced by their mothers and other people, without any explicitly labeled data. Developing a computational model and a machine learning method that enable an artificial system to acquire words and phonemes from speech signals automatically is an important challenge. Such a model also provides a hypothesis that can explain the dynamic process performed by infants, i.e., word discovery and phoneme acquisition from daily experiences. The nonparametric Bayesian double articulation analyzer (NPB-DAA) is an unsupervised machine learning method that can automatically discover word-like and phoneme-like units directly from speech signals. However, its performance has not yet been evaluated on natural spoken language that includes consonants. To deal with natural speech signals that include consonants, a comparative study of methods for extracting features from speech signals is crucially important. This paper provides a comparative study of feature extraction methods for direct word discovery with NPB-DAA from natural speech signals. We examined six feature extraction methods that combine mel-frequency cepstral coefficients (MFCCs) and a deep sparse autoencoder (DSAE) with several ways of incorporating dynamic features, using the TIDIGITS corpus, which contains utterances of connected digit sequences. The results showed that 1) NPB-DAA with or without DSAE can extract words and phonemes from natural speech signals containing consonants to a certain extent, 2) naive introduction of dynamic features can even harm the performance of word discovery, and 3) DSAE can consistently increase the correlation between the log-likelihood and the performance measure of word discovery.
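As a rough illustration of what "dynamic features" means here, the sketch below appends first- and second-order deltas to a sequence of static feature frames such as MFCCs. It uses the common regression-based delta formula with edge frames repeated as padding; the abstract does not state which delta definition or window width the authors used, so the formula, the default `width=2`, and the function names `delta` and `with_dynamics` are all assumptions made for illustration.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Regression-based delta features (assumed formula, not the paper's).

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)
    Frames beyond either end are replaced by the first or last frame.
    """
    if width < 1:
        raise ValueError("width must be >= 1")
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for d in range(len(frames[0])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][d]   # clamp at the last frame
                behind = frames[max(t - n, 0)][d]     # clamp at the first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def with_dynamics(frames: List[Frame]) -> List[Frame]:
    """Concatenate static, delta, and delta-delta features per frame."""
    d1 = delta(frames)
    d2 = delta(d1)
    return [static + a + b for static, a, b in zip(frames, d1, d2)]


if __name__ == "__main__":
    # Toy 1-dimensional "feature" that rises linearly over five frames.
    ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
    for frame in with_dynamics(ramp):
        print(frame)
```

Concatenating deltas in this way triples the feature dimensionality, which is one plausible reason a naive introduction of dynamic features could affect an unsupervised model such as NPB-DAA; the sketch does not reproduce any of the six configurations compared in the paper.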
