Audio classification and retrieval has long been an active research area, largely because of the difficulty of identifying and selecting the most useful audio features. Classifying audio files matters not only in multimedia applications but also in medicine, sound analysis, intelligent homes and cities, urban informatics, entertainment, and surveillance. This study introduces a method for retrieving and classifying audio data that is based on a modified bacterial foraging optimization algorithm (MBFOA). The goal of the method is to reduce the computational complexity of existing techniques. The peak estimated signal is combined with enhanced mel-frequency cepstral coefficients (EMFCC) and enhanced power-normalized cepstral coefficients (EPNCC), and the resulting features are optimized by MBFOA using a fitness function. A probabilistic neural network (PNN) then distinguishes music signals from speech signals in the audio source. Finally, the features corresponding to the class assigned by the classifier are extracted and listed. Compared with similar approaches, the MBFOA-based method achieves higher sensitivity, specificity, and accuracy.
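The classification and evaluation steps named above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a textbook Gaussian-kernel PNN with equal class priors, and it omits EMFCC, EPNCC, and MBFOA entirely. The function names `pnn_classify` and `confusion_metrics`, the smoothing parameter `sigma`, and the two-dimensional toy feature vectors are all assumptions made for illustration.

```python
import math


def pnn_classify(x, train_vectors, train_labels, sigma=0.5):
    """Classify x with a basic PNN (Parzen-window) rule.

    Pattern layer: one Gaussian kernel per training vector.
    Summation layer: average kernel response per class.
    Decision layer: class with the largest average (equal priors assumed).
    """
    sums, counts = {}, {}
    for vec, label in zip(train_vectors, train_labels):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, vec))
        kernel = math.exp(-sq_dist / (2.0 * sigma ** 2))
        sums[label] = sums.get(label, 0.0) + kernel
        counts[label] = counts.get(label, 0) + 1
    scores = {label: sums[label] / counts[label] for label in sums}
    return max(scores, key=scores.get), scores


def confusion_metrics(y_true, y_pred, positive):
    """Return sensitivity, specificity, and accuracy for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(pairs) if pairs else float("nan"),
    }


# Hypothetical 2-D feature vectors standing in for optimized audio features.
train_x = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_y = ["music", "music", "speech", "speech"]

if __name__ == "__main__":
    test_x = [[0.85, 0.15], [0.15, 0.85]]
    test_y = ["music", "speech"]
    predictions = [pnn_classify(x, train_x, train_y)[0] for x in test_x]
    print(predictions)
    print(confusion_metrics(test_y, predictions, positive="music"))
```

In this sketch, sensitivity is TP / (TP + FN), specificity is TN / (TN + FP), and accuracy is (TP + TN) / N, with "music" arbitrarily treated as the positive class. A smaller `sigma` makes the decision boundary follow individual training vectors more closely, while a larger value smooths it.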