Abstract

The accuracy of speech processing applications degrades in a co-channel environment, where more than one person is talking at the same time. The idea of usable speech segmentation is to identify and extract those portions of co-channel speech that are only minimally degraded and therefore still useful for speech processing applications (such as speaker identification or speech recognition) that do not otherwise work well in co-channel environments. Usable speech measures are features extracted from the co-channel signal to distinguish usable from unusable speech. Several recently developed usable speech extraction methods each rely on a single feature of the speech signal. This paper instead investigates a new usable speech extraction technique that sequentially and contextually selects several features of the given signal using the K-nearest neighbor classifier. The new approach considers periodicity-based and structure-based features simultaneously in order to maximize the classification rate and, by observing all incoming frames, avoids the problem of deciding how much data is needed to make accurate decisions. Speech processing applications can achieve 100% accuracy when operating on the extracted usable speech segments.
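The general idea of choosing features sequentially with a K-nearest neighbor classifier can be illustrated with a minimal sketch. The abstract does not specify the selection criterion or the exact features, so everything below is an assumption made for illustration: the greedy forward selection on validation accuracy, the function names (`knn_predict`, `sequential_forward_selection`), the value of `k`, and the synthetic per-frame data, in which columns 0 and 1 stand in for periodicity-based and structure-based measures and columns 2 and 3 are pure noise.

```python
import numpy as np

def knn_predict(train_X, train_y, test_X, k=3):
    """Label each test frame by majority vote among its k nearest training frames."""
    # Pairwise Euclidean distances, shape (n_test, n_train).
    dists = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]  # labels: 0 = unusable frame, 1 = usable frame
    return (votes.sum(axis=1) * 2 > k).astype(int)

def accuracy(train_X, train_y, val_X, val_y, cols, k=3):
    """Validation accuracy of k-NN restricted to the feature columns in cols."""
    pred = knn_predict(train_X[:, cols], train_y, val_X[:, cols], k)
    return float(np.mean(pred == val_y))

def sequential_forward_selection(train_X, train_y, val_X, val_y, k=3):
    """Greedily add the feature that most improves validation accuracy.

    This is an assumed criterion, not necessarily the one used in the paper.
    """
    remaining = list(range(train_X.shape[1]))
    selected, best_acc = [], 0.0
    while remaining:
        scores = [
            (accuracy(train_X, train_y, val_X, val_y, selected + [f], k), f)
            for f in remaining
        ]
        acc, feat = max(scores)
        if acc <= best_acc:  # stop when no remaining feature helps
            break
        selected.append(feat)
        remaining.remove(feat)
        best_acc = acc
    return selected, best_acc

# Synthetic stand-in for per-frame usable speech measures (hypothetical data).
rng = np.random.default_rng(0)
n_frames = 200
labels = rng.integers(0, 2, size=n_frames)
informative = labels[:, None] * 6.0 + rng.normal(size=(n_frames, 2))
noise = rng.normal(size=(n_frames, 2))
features = np.hstack([informative, noise])

train_X, val_X = features[:100], features[100:]
train_y, val_y = labels[:100], labels[100:]

selected, acc = sequential_forward_selection(train_X, train_y, val_X, val_y)
print("selected feature columns:", selected, "validation accuracy:", acc)
```

On real co-channel data the feature columns would be replaced by actual usable speech measures computed per frame, and the held-out split would need to respect speaker and utterance boundaries.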

