Abstract

Environmental sound classification (ESC) is of great significance to the monitoring and control of urban noise. To address the curse of dimensionality in ESC, a feature dimensionality reduction architecture combining attention and mutual information is proposed. To match the two-dimensional MFCC (Mel Frequency Cepstral Coefficients) feature matrix, the proposed method separates and reconstructs the feature frames of different samples, and reduces dimensionality by deciding which frames to retain based on the information entropy between the feature frames and the labels. In addition, the method combines an LSTM (Long Short-Term Memory) model with an attention mechanism to preserve recognition accuracy after dimensionality reduction. Ten urban acoustic event classes from the UrbanSound8k (US8K) dataset are used to verify the performance of the proposed method in simulation experiments, and the results are compared with existing classification methods. The simulation results show that, by combining the attention mechanism and mutual information, the proposed method reaches a recognition accuracy of 95.16% on the UrbanSound8k dataset with the smallest parameter scale among the compared methods, only 0.92M. Moreover, the model parameter scale can be adjusted through a dynamic frame retention mechanism to balance recognition accuracy against speed. The method therefore maintains high classification accuracy while reducing the computing power and storage required of monitoring equipment, which makes it more practical for urban acoustic event recognition.
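
The frame-selection idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple histogram (plug-in) estimate of mutual information, averaged over the MFCC coefficients of each frame position, and it keeps the highest-scoring frames. The function names (`mutual_information`, `frame_scores`, `select_frames`), the bin count, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def mutual_information(x_binned, y, n_bins, n_classes):
    """Plug-in estimate of I(X; Y) in nats for two discrete integer arrays."""
    joint = np.zeros((n_bins, n_classes))
    np.add.at(joint, (x_binned, y), 1.0)      # joint histogram of (bin, label)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)     # marginal over bins
    py = joint.sum(axis=0, keepdims=True)     # marginal over labels
    nz = joint > 0                            # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def frame_scores(mfcc, labels, n_bins=8):
    """Score each frame position by its mean coefficient/label mutual information.

    mfcc   : array of shape (n_samples, n_frames, n_coeffs)
    labels : integer class labels of shape (n_samples,)
    """
    n_samples, n_frames, n_coeffs = mfcc.shape
    n_classes = int(labels.max()) + 1
    scores = np.zeros(n_frames)
    for t in range(n_frames):
        for c in range(n_coeffs):
            col = mfcc[:, t, c]               # frame t, coefficient c, all samples
            # Assumption: equal-frequency binning via interior quantiles.
            edges = np.quantile(col, np.linspace(0, 1, n_bins + 1)[1:-1])
            binned = np.digitize(col, edges)  # values in 0 .. n_bins - 1
            scores[t] += mutual_information(binned, labels, n_bins, n_classes)
    return scores / n_coeffs

def select_frames(mfcc, labels, keep):
    """Keep the `keep` frame positions with the highest scores, in time order."""
    scores = frame_scores(mfcc, labels)
    kept = np.sort(np.argsort(scores)[-keep:])
    return mfcc[:, kept, :], kept

# Synthetic demonstration data (not UrbanSound8k): only frames 2 and 7
# carry label information, so they should be the ones retained.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=300)
mfcc = rng.normal(size=(300, 10, 4))
mfcc[:, 2, :] += labels[:, None] * 3.0
mfcc[:, 7, :] += labels[:, None] * 3.0

reduced, kept = select_frames(mfcc, labels, keep=2)
print(kept, reduced.shape)
```

In this sketch the `keep` argument plays the role of an adjustable frame-retention budget: a smaller value yields a shorter input sequence for the downstream classifier, at the possible cost of discarding informative frames.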
