Unsupervised mining of under-resourced speech corpora for tone features classification

Moses E Ekpenyong, Udoinyang G Inyang

doi:10.1109/ijcnn.2016.7727494

Abstract

In this contribution, the unsupervised mining of speech corpora for the efficient classification of tone features was investigated. Input vectors to the experiment were generated from tone pattern alignments of an Ibibio (Benue-Congo, Nigeria) corpus. The corpus used for the experiment contained 16,905 words/phrases. The proposed system design is novel and integrates two unsupervised tools, k-means clustering and a self-organizing map (SOM) model, into a methodological workflow that evaluates and selects the optimal number of clusters, then associates each cluster point with the input data points. To reduce data dimensionality for effective visualization, non-negative matrix factorization (NMF) was introduced to rid the k-means clusters of noisy attributes. The k-means cluster points generated by the optimum number of clusters (two in this case) were evaluated by the Silhouette algorithm and finally fed into the SOM to improve the efficiency of feature classification. The results validate existing research claims and demonstrate the importance of vowel-only features in the recognition of tone patterns. A SOM visualization of the input vectors revealed that vowel-only features correlate better with other input vectors, such as syllable and phoneme, than consonant-only features do. Furthermore, clustering the input datasets into the optimal number of clusters enabled proper and timely visualization of the map. This contribution is therefore vital for advancing future speech processing research on under-resourced languages.
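The cluster-selection step described in the abstract (run k-means for several candidate cluster counts, score each partition with the silhouette coefficient, keep the best) can be illustrated with a minimal sketch. This is not the authors' implementation. It runs on synthetic two-dimensional points rather than the Ibibio tone features, it omits the NMF and SOM stages, and the names `kmeans`, `silhouette`, `data`, `scores` and `best_k` are hypothetical, chosen only for illustration.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c]))
                      for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (between -1 and 1)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Synthetic stand-in data (an assumption): two well-separated 2-D blobs.
rng = random.Random(42)
data = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
data += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(30)]

# Score each candidate cluster count and keep the best-scoring one.
scores = {k: silhouette(data, kmeans(data, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

In this toy setup the data are generated with two groups, so the silhouette score should favour two clusters. On real tone-feature vectors the preferred count would depend on the data and on the feature encoding.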

Similar Papers

Development and validation of consensus clustering-based framework for brain segmentation using resting fMRI.
Srikanth Ryali ... Weidong Cai
Journal of neuroscience methods | VOL. 240
29 Nov 2014

Perbandingan Metode K-Means Clustering dengan Self-Organizing Maps (SOM) untuk Pengelompokan Provinsi di Indonesia Berdasarkan Data Potensi Desa
Lisa Rianti Iyohu ... La Ode Nashar
Jurnal Statistika dan Aplikasinya | VOL. 7
31 Dec 2024

Improvements Quality of Kohonen Maps Using Dimension Reduction Methods
Jiri Dvorsky ... Jana Kocibov
01 Apr 2010

K-Means Cluster for Seismicity Partitioning and Geological Structure Interpretation, with Application to the Yongshaba Mine (China)
Xueyi Shang ... Xibing Li
Shock and Vibration | VOL. 2017
01 Jan 2017
