Abstract
Speech recognition systems are widely used in telephony, smartphones, security, and home automation, where a system must identify and recognise an individual's voice before executing the next set of instructions. In this paper, we develop a method to identify both the speaker and the spoken word from basic audio samples of isolated words using cross-correlation in MATLAB. The method is a computationally inexpensive speech recognition and speaker identification technique that does not rely on complex speech algorithms or trained models. Speaker identification and speech recognition are performed on a specified set of word instructions in order to distinguish and analyse not only the words but also the speech pattern unique to an individual. After the audio signal is cleaned to remove noise, it is analysed using cross-correlation, and other speech parameters such as Mel-Frequency Cepstral Coefficients (MFCCs) and pitch are extracted. Results show an overall accuracy of 92% for a set of 5 command words and 2 individuals using only 10 training samples. Audio signals of the same word recorded by the same person are also found to have a significantly greater degree of correlation than other audio signals, which can form the basis of a valid speech recognition method for small-scale applications where processing capability is low.
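The core idea of template matching by cross-correlation can be illustrated with a short sketch. It is written in Python with only the standard library for portability, whereas the paper's implementation is in MATLAB. The abstract does not specify the decision rule, so the choice of the normalized correlation peak as the score, the function names `normalized_xcorr_peak` and `classify`, and the synthetic tone signals standing in for recorded words are all assumptions made for illustration. Noise removal, MFCC extraction, and pitch extraction are not covered.

```python
import math

def normalized_xcorr_peak(x, y):
    """Peak absolute cross-correlation of x and y over all lags,
    divided by the product of their energies' square roots (range 0..1)."""
    ex = math.sqrt(sum(v * v for v in x))
    ey = math.sqrt(sum(v * v for v in y))
    if ex == 0 or ey == 0:
        return 0.0
    best = 0.0
    # Slide y across x; each lag covers one possible alignment of the two signals.
    for lag in range(-(len(y) - 1), len(x)):
        s = 0.0
        for i, xv in enumerate(x):
            j = i - lag
            if 0 <= j < len(y):
                s += xv * y[j]
        best = max(best, abs(s))
    return best / (ex * ey)

def classify(sample, templates):
    """Return the label whose stored recordings correlate best with the sample,
    together with the per-label scores (assumed decision rule, not the paper's)."""
    scores = {
        label: max(normalized_xcorr_peak(sample, rec) for rec in recordings)
        for label, recordings in templates.items()
    }
    return max(scores, key=scores.get), scores

def tone(freq, n=64):
    """Synthetic stand-in for a recorded word: one sinusoid of n samples."""
    return [math.sin(2 * math.pi * freq * i / n) for i in range(n)]

# Hypothetical templates: one stored "recording" per command word.
templates = {"on": [tone(3)], "off": [tone(7)]}
# Test sample: a delayed copy of the "on" template.
sample = [0.0] * 5 + tone(3)
label, scores = classify(sample, templates)
print(label, scores)
```

Because the score is taken over all lags, a delayed copy of a template still matches it closely, which is the property that makes correlation usable without explicit alignment. The direct double loop costs on the order of the product of the two signal lengths per comparison; an FFT-based correlation would be the usual replacement for longer recordings.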