Automated environmental sound classification is an emerging area of research, with applications in criminal investigation, surveillance systems, biomedical fields, radio navigation systems, and elsewhere. The procedure typically consists of feature extraction and feature selection, followed by classification with machine learning or deep learning techniques. In this work, five feature extraction techniques are used initially: the discrete wavelet transform (DWT)-based empirical mode decomposition (EMD) technique, phase space reconstruction (PSR), kernel principal component analysis (KPCA), variational mode decomposition (VMD) and the tunable Q-wavelet transform (TQWT). The extracted features are then selected through seven feature selection techniques, of which two are newly proposed and five are existing. The proposed feature selection techniques are the stratified clustered collective technique (SCCT) and fuzzy-based template shape clustering (FTSC). The existing feature selection techniques are generalized discriminant analysis (GDA), ReliefF, the Fisher discriminant criterion (FDC), the Kruskal–Wallis test and the coati optimization algorithm (COA). The selected features are then classified with the proposed swarm intelligence-based hybrid AdaBoost-random forest classifier, termed SIHAR, with the less explored sparse representation classifier (SRC), and with eight other traditionally used machine learning classifiers. The work also proposes graph-dependent modeling of environmental sounds, in which rhythms are extracted for similarity assessment and the decision is then made using hypothesis testing. The work is tested on the Firat ESC-50 dataset, and the best classification accuracy, 87.48%, is obtained with the TQWT + SCCT + SIHAR combination.
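
The feature selection stage can be illustrated with a minimal sketch of filter-style ranking. The abstract names the Fisher discriminant criterion (FDC) but does not give its formula, so the code below assumes one common form of the Fisher score: between-class scatter divided by within-class scatter, computed per feature. The function names `fisher_scores` and `select_top_k` are hypothetical, and the sketch is not the authors' implementation.

```python
from statistics import fmean


def fisher_scores(X, y):
    """Return one Fisher score per feature column of X.

    Assumed form (not taken from the paper): for each feature,
    sum over classes of n_c * (class_mean - overall_mean)^2,
    divided by the sum of squared deviations from each class mean.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            class_mean = fmean(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in values)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_top_k(X, y, k):
    """Keep the k highest-scoring feature columns; return (reduced X, kept indices)."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return [[row[j] for j in keep] for row in X], keep
```

Under these assumptions, a feature whose class means are far apart relative to its within-class spread receives a high score and is retained. The reduced feature matrix would then be passed to a classifier such as those listed in the abstract.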