Abstract

Speech recognition systems are widely used in telephony, smartphones, security and home automation, where a system must identify and recognise an individual's voice before it executes the next set of instructions. In this paper, we develop a method to identify both the speaker and the spoken word from basic audio samples of isolated words, using cross-correlation in MATLAB. The paper explores a computationally inexpensive technique for speech recognition and speaker identification that does not rely on complex speech algorithms or trained models. Both tasks are performed on a specified set of word instructions, in order to distinguish and analyse not only the words but also the speech pattern unique to an individual. After the audio signal is cleaned to remove noise, it is analysed using cross-correlation, and other speech parameters such as Mel-Frequency Cepstral Coefficients (MFCCs) and pitch are extracted. Results show an overall accuracy of 92% for a set of 5 command words and 2 unique individuals using only 10 training samples. We also find that audio signals of the same word recorded by the same person correlate significantly more strongly with each other than with other audio signals. Cross-correlation can therefore serve as the basis for a valid speech recognition method in small-scale applications where the processing capabilities of the system are low.
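The matching idea can be illustrated with a minimal sketch. It is written in plain Python rather than MATLAB and is not the authors' implementation. It assumes each command word is represented by a single stored reference recording, and it scores an incoming sample by the peak of its normalised cross-correlation with each reference. The function names (normalized_xcorr_peak, classify, tone) and the synthetic sine-burst "words" are hypothetical, chosen only for illustration. Noise removal, MFCCs and pitch are left out.

```python
import math


def normalized_xcorr_peak(x, y):
    """Return the peak of the normalised cross-correlation of x and y over all lags."""
    # Remove the mean so a constant offset does not inflate the score.
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    x = [v - mean_x for v in x]
    y = [v - mean_y for v in y]
    # Normalise by the signal energies so the peak cannot exceed 1.
    norm = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if norm == 0.0:
        return 0.0
    best = 0.0
    for lag in range(-(len(y) - 1), len(x)):
        # Sum x[n] * y[n - lag] over the samples where the two signals overlap.
        start = max(0, lag)
        stop = min(len(x), len(y) + lag)
        acc = sum(x[n] * y[n - lag] for n in range(start, stop))
        best = max(best, acc)
    return best / norm


def classify(sample, templates):
    """Pick the label whose reference correlates best with the sample."""
    scores = {label: normalized_xcorr_peak(sample, ref)
              for label, ref in templates.items()}
    return max(scores, key=scores.get), scores


def tone(cycles, n=200):
    """Toy stand-in for a recorded word: a sine burst with a given number of cycles."""
    return [math.sin(2 * math.pi * cycles * i / n) for i in range(n)]


# Hypothetical reference set: one synthetic "recording" per command word.
templates = {"on": tone(5), "off": tone(11)}

# Test sample: the "on" burst, quieter and delayed by leading silence.
sample = [0.0] * 30 + [0.8 * v for v in tone(5)] + [0.0] * 20

label, scores = classify(sample, templates)
print(label, scores)
```

Taking the peak over all lags makes the score insensitive to where the word starts in the recording, and the energy normalisation makes it insensitive to loudness. Real speech varies in duration and timbre between utterances, so a practical system would likely need several references per word and speaker, in the spirit of the training samples mentioned above.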
