Abstract

A method for spoken word recognition has been proposed that uses the frequency‐time pattern inherent to each word. When the method is applied to speech from unspecified speakers, however, normalization is required, unlike the case of specified speakers, because the formants in the pattern may shift along the frequency axis owing to individual differences in the speakers' articulatory organs. Methods that include formant normalization or a nonlinear conversion of the frequency axis have a higher computational complexity and are not suited to a large vocabulary. From this viewpoint, this paper proposes a method in which a common weighted average dictionary is constructed by superposing the binary local‐peak patterns of multiple speakers, so that their individual differences are absorbed into the dictionary. Individual differences are further accommodated at matching time: the binary pattern of the input speech is given an allowable width along the frequency axis, centered on each local peak, before it is matched against the dictionary. The weighted average dictionary is only a pattern set representing the frequency of the local peaks inherent in each word, so it requires little memory to store. Because the input is matched as a binary pattern, the computational complexity is also reduced. The method was evaluated on 110 OA command words, with both male and female speech as input. A recognition rate of 97 percent was obtained for restricted speakers, and nearly 92 percent for unrestricted speakers.
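The three steps named in the abstract (binary local‐peak extraction, superposition into a weighted average dictionary, and matching with an allowable width along the frequency axis) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the strict local-maximum rule, the weight defined as the fraction of speakers showing a peak, the normalized score, and the assumption that all patterns are already time-aligned to the same number of frames are all hypothetical choices made for illustration.

```python
def local_peaks(frame):
    """Mark channels that are strict local maxima of one spectral frame (assumed rule)."""
    n = len(frame)
    bits = [0] * n
    for k in range(n):
        left = frame[k - 1] if k > 0 else float("-inf")
        right = frame[k + 1] if k < n - 1 else float("-inf")
        if frame[k] > left and frame[k] > right:
            bits[k] = 1
    return bits


def binary_pattern(spectrogram):
    """Convert a frequency-time pattern (list of frames) into a binary local-peak pattern."""
    return [local_peaks(frame) for frame in spectrogram]


def build_dictionary(patterns):
    """Superpose the binary patterns of several speakers for one word.

    Each weight is the fraction of speakers with a peak at that cell
    (an assumed weighting; patterns must share the same shape).
    """
    count = len(patterns)
    frames = len(patterns[0])
    channels = len(patterns[0][0])
    return [
        [sum(p[t][k] for p in patterns) / count for k in range(channels)]
        for t in range(frames)
    ]


def widen(pattern, width):
    """Give each local peak an allowable width of +/- `width` channels."""
    out = []
    for bits in pattern:
        n = len(bits)
        wide = [0] * n
        for k, bit in enumerate(bits):
            if bit:
                for j in range(max(0, k - width), min(n, k + width + 1)):
                    wide[j] = 1
        out.append(wide)
    return out


def score(entry, wide_input):
    """Sum the dictionary weights covered by the widened input, normalized by total weight."""
    total = sum(sum(row) for row in entry)
    hit = sum(
        w * b
        for w_row, b_row in zip(entry, wide_input)
        for w, b in zip(w_row, b_row)
    )
    return hit / total if total else 0.0


def recognize(dictionary, spectrogram, width=1):
    """Return the word whose dictionary entry best matches the widened input pattern."""
    wide = widen(binary_pattern(spectrogram), width)
    return max(dictionary, key=lambda word: score(dictionary[word], wide))
```

In this sketch, `dictionary` maps each word to the output of `build_dictionary`, and `recognize(dictionary, spectrogram, width=1)` tolerates a peak that is shifted by one channel relative to the speakers used to build the entry. Only additions, comparisons, and a table lookup per cell are needed, which reflects the low computational cost the abstract attributes to binary matching.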
