A New Speech Recognition Model in a Human-Robot Interaction Scenario Using NAO Robot: Proposal and Preliminary Model

Hussain A. Younis, Sani Salisu, M. N. Ab Wahab, A. S. A. Mohamed, R. Jamaludin

doi:10.1109/icict52195.2021.9568457

Abstract

Speech recognition goes by several names: automatic speech recognition (ASR), speech-to-text, and computer speech recognition. For a single user's voice, it is necessary to distinguish between speech recognition and voice recognition. The first translates audible, meaningful human speech into text; the second only identifies a sound, such as an animal sound or a car. No algorithm is designed specifically for this field; instead, techniques such as N-grams and neural networks are used, along with Natural Language Processing (NLP), Hidden Markov Models (HMM), and Speaker Diarization (SD). The last of these is addressed in this work. Natural language processing is a computational technique that can be applied at various levels of linguistic analysis (for example, deep analysis) to represent natural language in a more useful form. Current recognition and identification systems can still be improved to achieve greater accuracy. A new approach is proposed that processes speech in four stages: speech recognition, tokenization, extraction of speech features from text, and part-of-speech tagging with three patterns of Named Entity Recognition (NER), followed by a possible implementation of the proposed model. The model achieved more accurate results when applied in an educational environment using a NAO robot.
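The text-side stages named in the abstract (tokenization, feature extraction, and entity tagging) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the speech recognition stage has already produced a transcript string, uses bigram counts as a stand-in for the text features, and uses three hypothetical entity categories (PERSON, LOCATION, ORGANIZATION) with made-up word lists, since the abstract does not say which three NER patterns are used. All function names are illustrative.

```python
import re
from collections import Counter


def tokenize(text):
    """Split a transcript into word tokens, keeping the original case."""
    return re.findall(r"[A-Za-z0-9']+", text)


def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical word lists: the three categories and their entries are
# assumptions for illustration, not taken from the paper.
NER_PATTERNS = {
    "PERSON": {"Ali", "Sara"},
    "LOCATION": {"Penang"},
    "ORGANIZATION": {"USM"},
}


def tag_entities(tokens):
    """Label each token with an entity category, or "O" if none matches."""
    tagged = []
    for token in tokens:
        label = "O"
        for category, vocabulary in NER_PATTERNS.items():
            if token in vocabulary:
                label = category
                break
        tagged.append((token, label))
    return tagged


def process_transcript(text):
    """Run tokenization, bigram counting, and entity tagging on a transcript."""
    tokens = tokenize(text)
    entities = [(tok, lab) for tok, lab in tag_entities(tokens) if lab != "O"]
    return {
        "tokens": tokens,
        "bigrams": Counter(ngrams(tokens, 2)),
        "entities": entities,
    }


if __name__ == "__main__":
    result = process_transcript("Ali visited Penang with Sara")
    print(result["tokens"])
    print(result["entities"])
```

A dictionary lookup is the simplest possible tagger; a trained NER model or a statistical part-of-speech tagger would replace `tag_entities` in a realistic system.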

Full Text