Abstract

Analysis and detection of the human voice in workplace settings such as telecommunications, military, medical, and law enforcement scenarios is important for assessing a worker's ability and assigning tasks accordingly. This paper presents the results of a preliminary study on recognizing speech from the human voice using mel-frequency cepstral coefficient (MFCC) features. Sixteen mel-scale warped cepstral coefficients were used independently to recognize two spoken commands in Bangla, our native language. Cepstral coefficients for the utterances ‘BATI JALAO’ (i.e., TURN ON LIGHT) and ‘PAKHA BONDHO KORO’ (i.e., TURN OFF FAN) from a particular speaker under preliminary investigation were used as features in a neural network. The network is trained on the MFCC features of two speakers in such a way that it recognizes only one particular person, along with his command, and terminates the program for the other. The feature-matching results from the neural network indicate that MFCC features are effective for recognizing speech.
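
As an illustration of the kind of feature extraction described above, the following is a minimal sketch of computing 16 MFCCs per frame with NumPy. It is not the procedure used in the study: the sample rate, frame length, hop size, FFT size, and number of mel filters are all assumed values, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale mapping (one of several conventions in use)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=16):
    # All default parameter values are assumptions, except n_coeffs=16,
    # which matches the number of coefficients named in the abstract.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)           # windowed frames
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft  # power spectrum
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energy = np.log(power @ fbank.T + 1e-10)            # log mel energies
    # Unnormalized DCT-II of the log energies gives the cepstral coefficients
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1)
                   / (2 * n_filters))
    return log_energy @ basis.T                             # (n_frames, n_coeffs)
```

Calling `mfcc` on a one-dimensional NumPy array of audio samples returns one row of 16 coefficients per frame. These rows, or a summary of them such as a per-utterance mean, could then serve as the input vector to a neural network classifier.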
