Abstract

Keyword spotting is generally based on phonemic information. Consequently, if the input speech contains a phoneme sequence similar to the keyword, it may be incorrectly detected as the keyword even when its prosodic pattern, such as accent, differs greatly. To avoid such false alarms, this paper presents a method that uses both the phonemic information and the F0 information to determine the likelihood that the input speech is the keyword. The F0 contour of the keyword is registered beforehand as a template. The F0 contour of the input speech is compared with the template by DP matching, and the likelihood of the keyword is evaluated from the resulting dissimilarity together with the phonemic likelihood. In an experiment using news broadcasts, the false alarm rate was reduced by 30 to 50% at an equivalent detection rate. © 2001 Scripta Technica, Syst Comp Jpn, 32(7): 52–61, 2001
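
The DP matching step can be illustrated with a minimal sketch. The abstract does not give the local distance, the path constraints, the normalization, or the rule for combining the two scores, so everything below is an assumption made for illustration: contours are lists of per-frame F0 values over voiced frames, the local distance is the absolute difference, the accumulated cost is normalized by the sum of the two lengths, and the hypothetical `keyword_score` subtracts a weighted dissimilarity from a phonemic log-likelihood.

```python
import math


def dtw_dissimilarity(template, contour):
    """Length-normalized DP-matching (DTW) distance between two F0 contours.

    Assumption: both inputs are non-empty sequences of per-frame F0 values
    (for example log-F0 of voiced frames). The paper's actual local distance
    and path constraints are not stated in the abstract.
    """
    n, m = len(template), len(contour)
    if n == 0 or m == 0:
        raise ValueError("contours must be non-empty")

    # cost[i][j]: minimum accumulated distance aligning
    # template[:i] with contour[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(template[i - 1] - contour[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # template frame advances alone
                cost[i][j - 1],      # input frame advances alone
                cost[i - 1][j - 1],  # both advance together
            )

    # Normalize so that long and short keywords are comparable.
    return cost[n][m] / (n + m)


def keyword_score(phonemic_log_likelihood, f0_dissimilarity, weight=1.0):
    """Hypothetical combination rule: penalize the phonemic score by the
    weighted F0 dissimilarity. The weight is an illustrative parameter."""
    return phonemic_log_likelihood - weight * f0_dissimilarity
```

Under these assumptions, a candidate whose F0 contour is a time-stretched copy of the template receives a dissimilarity of zero and keeps its phonemic score, while a candidate with the same phonemes but a different accent pattern is penalized and is less likely to cross the detection threshold.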
