Abstract

This letter presents a novel confidence measure for improving user performance in Spoken Document Retrieval (SDR). The proposed confidence measure is based on the phonetic distance between subword models and employs an anti-model that is determined, using offline training data, to be discriminative with respect to a target model. As an advance on our previous work, the proposed method uses separate phonetic similarity knowledge for vowels and consonants, resulting in more reliable performance across the diverse recording conditions of SDR speech. A transcript reliability estimator is also presented and evaluated as an application of the proposed confidence measure. Analysis on a variety of corpora, covering background noise, frequency band restrictions, and a range of real-life conditions, shows that the proposed confidence measure is more reliable in detecting speech corrupted by acoustic conditions or an unarticulated speaking style, and that it correlates more strongly with word error rate (WER). The proposed confidence measure improves transcript reliability estimation performance by a relative 16.21%.
