A speech recognition approach for an industrial training station

Valentin-Cătălin Govoreanu, Adrian-Nicolae Ţocu, Dragoş Circa, Alin-Marius Cruceat

doi:10.1051/matecconf/202134304003

Open Access

Abstract

This paper presents a speech recognition service used to command and guide the activities around an industrial training station. The entire concept is built on a decentralized microservice architecture, and the speech recognition engine is one of its many hardware and software components. This engine lets users interact seamlessly with the other components in order to ensure a gradual and productive learning process. By working with different APIs for both English and Romanian, the presented approach achieves good recognition of the predefined task phrases that support the training procedure and reduces the time required for recognition by almost 50%.

Highlights

Speech recognition (SR), sometimes referred to as Automatic Speech Recognition (ASR), is one of the most widely used alternative interaction methods; others rely on physical controls such as buttons.
The speech recognition system was implemented in C#, which was also used to integrate the Google Speech-to-Text API [1].
Intelligent manufacturing systems are playing a significant role in this promising new industry.

Summary

Introduction

Speech recognition (SR), sometimes referred to as Automatic Speech Recognition (ASR), is one of the most widely used alternative interaction methods; others rely on physical controls such as buttons. Speech is the most natural way for humans to interact with each other and an appealing way to introduce uninitiated users to interacting with machines. It is an efficient way to interact with computers, especially when users have their hands occupied with a task. The goal was to extend the capabilities of our training station by creating a continuous speech recognition system that could understand a predefined set of commands, such as “Open first application”, “I need help” or “Stop the training”, and execute specific instructions. The system would understand both Romanian and English, and it would return the gender of the user based on their speech.
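The idea of mapping a recognized utterance onto a predefined command set can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' C# implementation; it assumes the speech engine has already returned a text transcript, and the action labels and function names are hypothetical. Only the three command phrases are taken from the text above.

```python
import string

# Command phrases quoted in the text; the action labels on the right are
# hypothetical names chosen for this illustration.
COMMANDS = {
    "open first application": "OPEN_APP_1",
    "i need help": "SHOW_HELP",
    "stop the training": "STOP_TRAINING",
}


def normalize(transcript: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    cleaned = transcript.lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    return " ".join(cleaned.split())


def match_command(transcript: str):
    """Return the action for the first known phrase found in the transcript,
    or None when no predefined command is present."""
    # Pad with spaces so phrases only match on whole-word boundaries.
    text = f" {normalize(transcript)} "
    for phrase, action in COMMANDS.items():
        if f" {phrase} " in text:
            return action
    return None
```

Matching on a normalized transcript keeps the command set tolerant of capitalization, punctuation, and filler words around the phrase (for example "Please, open first application."), while anything outside the predefined set is ignored.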

Objectives

Methods

Results

Conclusion