A Hybrid Spoken Language Processing System for Smart Device Troubleshooting

Praveen Edward James, Hou Kit Mun, Chockalingam Aravind Vaithilingam

doi:10.3390/electronics8060681

Abstract

The purpose of this work is to develop a spoken language processing system for smart device troubleshooting using human-machine interaction. This system combines a software Bidirectional Long Short Term Memory Cell (BLSTM)-based speech recognizer and a hardware LSTM-based language processor for Natural Language Processing (NLP) using the serial RS232 interface. Mel Frequency Cepstral Coefficient (MFCC)-based feature vectors from the speech signal are directly input into a BLSTM network. A dropout layer is added to the BLSTM layer to reduce over-fitting and improve robustness. The speech recognition component is a combination of an acoustic modeler, pronunciation dictionary, and a BLSTM network for generating query text, and executes in real time with an 81.5% Word Error Rate (WER) and average training time of 45 s. The language processor comprises a vectorizer, lookup dictionary, key encoder, Long Short Term Memory Cell (LSTM)-based training and prediction network, and dialogue manager, and transforms query intent to generate response text with a processing time of 0.59 s, 5% hardware utilization, and an F1 score of 95.2%. The proposed system shows a 4.17% decrease in accuracy compared with existing systems, which use parallel processing and high-speed cache memories to perform additional training and thereby improve accuracy. However, the language processor achieves a 36.7% decrease in processing time and a 50% decrease in hardware utilization, making it suitable for troubleshooting smart devices.
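For readers unfamiliar with the Word Error Rate metric quoted in the abstract, the following is a minimal sketch of how WER is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name and the example sentences are illustrative assumptions and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up troubleshooting query: one of four reference words is dropped.
print(word_error_rate("turn on the light", "turn on light"))
```

Because insertions are counted against the reference length, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.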

Highlights

Manipulating speech signals to extract relevant information is known as speech processing [1]. This work integrates an optimized realization of speech recognition with Natural Language Processing (NLP) and a Text to Speech (TTS) system to perform Spoken Language Processing (SLP) using a hybrid software-hardware design approach.
These results indicate that implementing the Bidirectional Long Short Term Memory Cell (BLSTM)-based speech recognition system improves accuracy.
The results reveal that the language processor performs better in terms of F1 score and processing time.
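The F1 score mentioned above is the harmonic mean of precision and recall. The sketch below shows the standard calculation for a single class treated as positive. The intent labels and the one-vs-rest framing are assumptions made for illustration, since this page does not state how the paper aggregates F1 across classes.

```python
def f1_score(true_labels, predicted_labels, positive):
    """F1 for one class treated as positive (one-vs-rest).

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive and truth == positive:
            tp += 1  # correctly predicted positive
        elif pred == positive:
            fp += 1  # predicted positive, actually another class
        elif truth == positive:
            fn += 1  # missed a true positive
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up intent labels for four queries.
truth = ["reset", "reset", "other", "reset"]
predicted = ["reset", "other", "reset", "reset"]
print(f1_score(truth, predicted, positive="reset"))
```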

Summary

Introduction

This work integrates an optimized realization of speech recognition with Natural Language Processing (NLP) and a Text to Speech (TTS) system to perform Spoken Language Processing (SLP) using a hybrid software-hardware design approach. SLP involves three major tasks: translating speech to text (speech recognition), capturing the intent of the text and determining an action using data processing techniques (NLP), and responding to users through voice (speech synthesis). The Long Short Term Memory cell (LSTM), a class of Recurrent Neural Networks (RNN), is currently the state of the art for continuous word speech recognition and NLP, due to its ability to process sequential data [2]. Several LSTM-based speech recognition techniques are available in the literature. For end-to-end speech recognition, speech spectrograms are chosen directly as the pre-processing scheme and processed by a deep bidirectional LSTM network with a novel Connectionist Temporal

Journal: Electronics	Publication Date: Jun 16, 2019
Citations: 6	License type: CC BY 4.0
