Detection of vowel onset and offset points using non‐local similarity between DWT approximation coefficients

A Kumar, G Pradhan

doi:10.1049/el.2018.0629

Abstract

In a speech signal, the transition at vowel offset points (VEPs) differs markedly from the transition at vowel onset points (VOPs). Consequently, most of the features reported for detecting VOPs fail to detect VEPs. To address this issue, a front-end speech parametrisation approach is proposed for detecting VOPs and VEPs simultaneously. In the proposed approach, the energy due to high-frequency unvoiced sound units is first suppressed using the discrete wavelet transform (DWT). Weight values (WVs) are then assigned to each sample point by computing the similarity between analysis frames within a search neighbourhood using non-local means (NLM) estimation. The WVs computed from the NLM are significantly lower when the frames under consideration are similar than when they are dissimilar. Since vowels are longer regions and exhibit periodicity, frames belonging to these regions are more similar to one another. In this work, this property of the WVs is used as a feature for detecting VOPs and VEPs. The proposed feature is observed to outperform earlier reported features in detecting both VOPs and VEPs, and the improvement in the detection accuracy of VEPs is significant.
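The two stages named in the abstract, DWT-based suppression followed by frame comparison within a search neighbourhood, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the Haar wavelet, the single decomposition level, the frame and search sizes, the smoothing parameter `h`, and the function names `haar_approximation` and `nlm_similarity` are all hypothetical. The sketch uses the standard NLM Gaussian kernel `exp(-d²/h²)`, which is larger for similar frames. The abstract describes its WVs as lower for similar frames, so the paper's exact WV definition presumably differs from this kernel, and the code should be read only as an illustration of the frame comparison.

```python
import numpy as np


def haar_approximation(x, levels=1):
    """Haar DWT approximation coefficients (low-pass branch only).

    Assumption: the Haar wavelet is used purely for illustration.
    """
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        if a.size % 2:
            # Pad odd-length input by repeating the last sample.
            a = np.append(a, a[-1])
        # Scaled pairwise sums form the Haar low-pass output.
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)
    return a


def nlm_similarity(a, half_frame=4, half_search=20, h=1.0):
    """Average Gaussian-kernel similarity between the frame centred on each
    sample and the other frames inside its search neighbourhood.

    Assumption: standard NLM kernel exp(-d2 / h**2); larger means more similar.
    """
    a = np.asarray(a, dtype=float)
    n = a.size
    padded = np.pad(a, half_frame, mode="edge")
    # frames[i] is the window of 2*half_frame + 1 samples centred on sample i.
    frames = np.lib.stride_tricks.sliding_window_view(padded, 2 * half_frame + 1)
    sim = np.zeros(n)
    for i in range(n):
        lo, hi = max(0, i - half_search), min(n, i + half_search + 1)
        # Mean squared difference between frame i and each neighbouring frame.
        d2 = np.mean((frames[lo:hi] - frames[i]) ** 2, axis=1)
        w = np.exp(-d2 / h ** 2)
        # Exclude the trivial self-comparison (d2 = 0, w = 1) from the average.
        sim[i] = (w.sum() - 1.0) / max(hi - lo - 1, 1)
    return sim
```

Applied to a synthetic signal made of a noise segment followed by a sinusoid, `nlm_similarity(haar_approximation(x), h=0.5)` gives a noticeably higher average over the periodic segment than over the noise segment, which mirrors the abstract's observation that frames within vowel-like periodic regions resemble one another.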

Full Text