Abstract

Environmental sounds are widely used in everyday applications, particularly in tasks such as smart-city management, location determination, surveillance systems, machine hearing, and environmental monitoring. The main method for these tasks, environmental sound classification (ESC), has been studied increasingly in recent years. However, environmental sounds are harder to classify than other sounds because many factors introduce noise. This study searched for the convolutional neural network (CNN) model that gives the highest accuracy on ESC tasks by optimizing its hyper-parameters. For this purpose, the Particle Swarm Optimization (PSO) algorithm was rearranged to represent the CNN architecture, so that the CNN hyper-parameters are represented exactly, without any transformation, during optimization. Experiments were carried out on the ESC-10, ESC-50, and Urbansound8k data sets, which are widely used benchmarks for ESC tasks. Data augmentation techniques were applied to the data sets when training the CNN models. The CNN models obtained with PSO achieved accuracies of 98.64 % on ESC-10, 93.71 % on ESC-50, and 98.45 % on Urbansound8k. Compared with previous studies, these are the best accuracy values obtained with a pure CNN model. As a result, CNN models with high classification accuracy can be designed automatically for the classification of urban sounds. Researchers without expert knowledge of CNN design can therefore apply this method to their own data sets.
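The abstract does not give the particle encoding or the update rule, so the following is only a minimal sketch of the general idea, under stated assumptions: each particle holds the hyper-parameter values directly (no continuous encoding), and a simple discrete PSO-style update copies values from the personal or global best. The search space, the names `SEARCH_SPACE`, `toy_fitness`, and `pso_search`, and the probabilities are all hypothetical; `toy_fitness` stands in for training a CNN and measuring validation accuracy.

```python
import random

# Hypothetical search space; names and ranges are illustrative, not from the paper.
SEARCH_SPACE = {
    "num_conv_layers": [1, 2, 3, 4, 5],
    "filters": [16, 32, 64, 128],
    "kernel_size": [3, 5, 7],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}


def toy_fitness(params):
    """Stand-in for validation accuracy; a real run would train a CNN here."""
    target = {"num_conv_layers": 3, "filters": 64,
              "kernel_size": 3, "learning_rate": 1e-3}
    return sum(params[k] == target[k] for k in target) / len(target)


def pso_search(fitness, n_particles=10, n_iters=30,
               p_global=0.4, p_personal=0.3, p_explore=0.1, seed=0):
    """Discrete PSO-style search; particles store hyper-parameter values as-is."""
    rng = random.Random(seed)
    particles = [{k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
                 for _ in range(n_particles)]
    personal_best = [dict(p) for p in particles]
    personal_score = [fitness(p) for p in particles]
    best_idx = max(range(n_particles), key=lambda i: personal_score[i])
    global_best = dict(personal_best[best_idx])
    global_score = personal_score[best_idx]

    for _ in range(n_iters):
        for i, p in enumerate(particles):
            for key, choices in SEARCH_SPACE.items():
                r = rng.random()
                if r < p_global:
                    p[key] = global_best[key]        # move toward the global best
                elif r < p_global + p_personal:
                    p[key] = personal_best[i][key]   # move toward the personal best
                elif r < p_global + p_personal + p_explore:
                    p[key] = rng.choice(choices)     # random exploration
                # otherwise the current value is kept
            score = fitness(p)
            if score > personal_score[i]:
                personal_best[i], personal_score[i] = dict(p), score
                if score > global_score:
                    global_best, global_score = dict(p), score
    return global_best, global_score


if __name__ == "__main__":
    best, score = pso_search(toy_fitness)
    print(best, score)
```

In an actual ESC setting, the fitness function would build a CNN from the particle's values, train it on the (augmented) audio features, and return the validation accuracy, which makes each evaluation far more expensive than in this toy version.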
