Combining Fuzzy Clustering and Hidden Markov Models for Sundanese Speech Recognition

Intan Nurma Yulita, Akik Hidayat, Atje Setiawan Abdullah, Erick Paulus

doi:10.1088/1742-6596/1028/1/012239


Open Access



Abstract

The Sundanese are one of the most populous tribes in Indonesia. However, the number of Sundanese speakers is declining over time because of the other living languages in use. One way to help preserve Sundanese is to develop Sundanese speech recognition. In this research, recognition consists of several processes: pre-processing, feature extraction, Fuzzy Clustering, and Hidden Markov Models. Pre-processing separates the speech from the noise in the recording and normalizes the speech signal, while feature extraction obtains the characteristics of the speech signal that distinguish each phoneme. The particular contribution of this research is the combination of Fuzzy Clustering and Hidden Markov Models for Sundanese speech recognition. Fuzzy Clustering finds the unique symbols in the speech signal; these symbols are represented as centroids. In the next process, the membership probability of each segment of the speech signal is calculated for all centroids. The output of this calculation becomes the input to the Hidden Markov Models. The test uses a speech corpus derived from 30 people. The results show that the combination of Fuzzy Clustering and Hidden Markov Models performs better than Hidden Markov Models alone. The research also analyses the optimal number of Fuzzy Clustering clusters and Hidden Markov Model states for the datasets used.
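The membership step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the fuzzy clustering variant, the fuzzifier, or the distance measure, so the sketch assumes standard fuzzy c-means memberships with fuzzifier `m = 2` and Euclidean distance. The function names `fcm_memberships` and `membership_sequence` and the toy centroids are hypothetical and are not taken from the paper.

```python
import math


def fcm_memberships(frame, centroids, m=2.0):
    """Fuzzy c-means membership of one feature vector to every centroid.

    Assumption: u_i = d_i^(-2/(m-1)) / sum_k d_k^(-2/(m-1)),
    where d_i is the Euclidean distance from the frame to centroid i.
    """
    dists = [math.dist(frame, c) for c in centroids]

    # A frame lying exactly on one or more centroids belongs fully to them.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        share = 1.0 / len(zeros)
        return [share if i in zeros else 0.0 for i in range(len(centroids))]

    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def membership_sequence(frames, centroids, m=2.0):
    """Membership vector for each segment (frame) of an utterance."""
    return [fcm_memberships(f, centroids, m) for f in frames]


# Toy example: two centroids in a 2-D feature space (illustrative values only).
centroids = [(0.0, 0.0), (4.0, 0.0)]
frames = [(1.0, 0.0), (4.0, 0.0)]
memberships = membership_sequence(frames, centroids)
```

Each membership vector sums to 1, so it can be read as a soft assignment of a segment to the symbols found by clustering. How the paper passes these vectors to the Hidden Markov Models is not stated in the abstract, so the sketch stops at this step.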
