Voice activity detection in the presence of transient based on graph

Xiao-Yuan Guo, Chun-Xian Gao, Hui Liu

doi:10.1186/s13636-023-00282-x

Open Access

Abstract

Voice activity detection achieves satisfactory performance in quasi-stationary noisy environments, but it remains a significant challenge in the presence of transients because transients are more dominant than speech. This paper studies how speech and transients differ in their nonlinear dynamic characteristics and proposes a new method for accurately detecting both. Because of algorithmic complexity, few previous detectors model speech and transients using contextual information, and they therefore fail to detect transient frames accurately. To address this challenge, we map features of the audio signal to a time-series complex network, a kind of graph data, analyze the network through its Laplacian and adjacency matrices, and classify the result with a support vector machine (SVM). The proposed algorithm can analyze a longer span of speech, which allows it to make full use of the contextual information in preceding and following frames. Experimental results show that the method clearly outperforms existing algorithms.
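To make the idea of mapping a feature sequence to a graph more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a natural visibility graph as the time-series-to-network mapping (the abstract does not say which construction the paper uses), and the per-frame values in `frame_energy`, the function names, and the descriptors are all hypothetical. It builds the adjacency matrix, derives the Laplacian L = D - A, and computes a few simple graph descriptors of the kind that could be passed to a classifier such as an SVM.

```python
def visibility_graph(series):
    """Adjacency matrix of the natural visibility graph of a 1-D sequence.

    Assumption: this mapping is chosen for illustration only; the paper's
    own graph construction may differ.
    """
    n = len(series)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Frames i and j are linked if every frame between them lies
            # strictly below the straight line joining their values.
            visible = all(
                series[k] < series[i] + (series[j] - series[i]) * (k - i) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                adj[i][j] = adj[j][i] = 1
    return adj


def laplacian(adj):
    """Graph Laplacian L = D - A, where D is the diagonal degree matrix."""
    n = len(adj)
    return [
        [(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]


def graph_descriptors(adj):
    """A few hypothetical descriptors that could serve as classifier input."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    edges = sum(degrees) // 2
    return {
        "mean_degree": sum(degrees) / n,
        "max_degree": max(degrees),
        "density": 2 * edges / (n * (n - 1)),
    }


# Hypothetical per-frame feature values (for example, short-time energy).
frame_energy = [0.1, 0.2, 0.9, 0.3, 0.2, 0.8, 0.1]
adj = visibility_graph(frame_energy)
lap = laplacian(adj)
desc = graph_descriptors(adj)
```

Because each node can be linked to frames well before and after it, descriptors computed on such a graph reflect context on both sides of a frame, which is the property the abstract emphasizes.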
