Abstract

Smart mirrors are conventional mirrors augmented with embedded-system capabilities that provide comfort and sophistication for users, including a speech-command function. However, existing research still applies the Google Speech API, which relies on the cloud and yields sub-optimal processing time. Our research aims to design speech recognition using Mel-frequency cepstral coefficients (MFCC) and a convolutional neural network–long short-term memory (CNN-LSTM) model, applied to smart mirror edge devices for optimal processing time. Our first step was to download a synthetic speech recognition dataset of waveform audio files (WAVs) from Kaggle, covering the utterances “left,” “right,” “yes,” “no,” “on,” and “off.” We then designed the speech recognition pipeline, incorporating Fourier transformation and low-pass filtering. We benchmarked MFCC against linear predictive coding (LPC), as both are feature extraction methods for speech datasets, and benchmarked CNN-LSTM against LSTM, a simple recurrent neural network (RNN), and a gated recurrent unit (GRU). Finally, we designed a smart mirror system complete with a GUI and its functions. The test results show that CNN-LSTM outperforms the three other methods, with accuracy, precision, recall, and F1-score of 0.92. The speech command with the best precision is “no,” at 0.940, while the command with the best recall is “off,” at 0.963. The speech command with the worst precision and recall is “other,” at 0.839. The contribution of this research is a smart mirror whose speech commands are processed on the edge device with CNN-LSTM.
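The abstract does not give implementation details, so the following is a minimal sketch of how an MFCC front end feeding a CNN-LSTM classifier might look. The sample rate (16 kHz), MFCC count (40), frame padding, and layer sizes are illustrative assumptions, not the paper's reported hyperparameters, and the helper names (extract_mfcc, build_cnn_lstm) are hypothetical.

# Sketch of an MFCC -> CNN-LSTM pipeline for short speech commands.
# Assumptions (not from the paper): 16 kHz mono WAVs of about 1 s,
# 40 MFCC coefficients, and a small Conv1D + LSTM stack.
import numpy as np
import librosa
from tensorflow import keras
from tensorflow.keras import layers

COMMANDS = ["left", "right", "yes", "no", "on", "off", "other"]

def extract_mfcc(path, sr=16000, n_mfcc=40, max_frames=44):
    """Load a WAV, compute MFCCs, and pad/trim to a fixed frame count."""
    audio, _ = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    mfcc = mfcc[:, :max_frames]
    if mfcc.shape[1] < max_frames:
        mfcc = np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])))
    return mfcc.T  # (frames, n_mfcc): time becomes the sequence axis

def build_cnn_lstm(input_shape=(44, 40), n_classes=len(COMMANDS)):
    """Conv1D layers learn local spectral patterns; the LSTM models their order."""
    return keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv1D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling1D(2),
        layers.Conv1D(128, 3, padding="same", activation="relu"),
        layers.MaxPooling1D(2),
        layers.LSTM(64),
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_cnn_lstm()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

Keeping the convolutional front end ahead of the LSTM is what distinguishes this design from the plain LSTM, RNN, and GRU baselines benchmarked in the paper: the convolutions compress each time step's spectral frame before the recurrent layer models the temporal sequence.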
