Abstract

This research develops a method for issuing voice commands to programmable logic controllers (PLCs) and supervisory control and data acquisition (SCADA) systems, which are widely used in industrial automation. The approach uses artificial intelligence for human–machine interaction, in line with Industry 4.0 trends. A deep neural network was designed specifically for speech recognition, so the method does not rely on any pre-existing speech-to-text engine. The objective was a model that is both accurate and compact, making it suitable for embedded systems in industrial settings. To train the network, 21,600 sound files were generated by combining real factory noise with a synthetic dataset of human speech, forming a dataset of 60 classes of voice commands. The commands cover actions such as starting, stopping, and running at various speeds for 10 motors controlled by the automation system. Mel-frequency cepstral coefficients (MFCCs) were extracted from the voice commands and fed directly into the proposed network, which achieved an accuracy of 99.73% and outperformed networks several times its size.
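
As an illustration of the MFCC step mentioned above, the following is a minimal sketch of a textbook MFCC computation (framing, Hamming window, power spectrum, mel filterbank, log, DCT-II). It is not the authors' implementation. The abstract does not state the sample rate, frame length, hop size, number of mel filters, or number of coefficients, so every value below is an assumption chosen for illustration. The sketch uses only the Python standard library and a naive DFT for portability; a practical pipeline would more likely use an FFT-based audio library.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window one frame and return its one-sided power spectrum."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive DFT bin; a real pipeline would use an FFT instead.
        s = sum(w * cmath.exp(-2j * math.pi * k * t / n)
                for t, w in enumerate(windowed))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale, one row per filter."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2)
    # n_filters + 2 edge points: left, centre, and right for each triangle.
    edges_hz = [mel_to_hz(top * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    # Fractional bin positions avoid zero-width triangles.
    edges = [hz * n_fft / sample_rate for hz in edges_hz]
    bank = []
    for m in range(1, n_filters + 1):
        left, centre, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            rising = (k - left) / (centre - left)
            falling = (right - k) / (right - centre)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def dct2(values, n_out):
    """Unnormalised DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, v in enumerate(values))
            for k in range(n_out)]


def mfcc(signal, sample_rate, frame_len=256, hop=128,
         n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs MFCCs per frame (all defaults are assumed)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        # Floor the energies so the logarithm is always defined.
        log_energies = [math.log(max(e, 1e-10)) for e in energies]
        features.append(dct2(log_energies, n_coeffs))
    return features


# Hypothetical input: a 440 Hz test tone standing in for a voice command.
sample_rate = 8000  # assumed; not given in the abstract
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(1024)]
features = mfcc(tone, sample_rate)
```

Here `features` is a frames-by-coefficients matrix of the kind that could be passed to a classifier. How the proposed network actually consumes the MFCC data is not described in the abstract.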
