In this research, we propose a variant of the Bare-Bones Particle Swarm Optimization (BBPSO) algorithm for hyper-parameter selection and deep architecture generation in image, audio and video classification tasks. Because the search process of the original BBPSO model is guided only by a single leader and the particles’ personal best experiences, it lacks interaction with neighbouring elite solutions. To overcome this limitation, we propose a versatile search process for a modified BBPSO model that incorporates a number of effective components and operations. These include neighbouring and global best signals, search actions with Cauchy/Levy scale factors, sub-dimension operations guided by the local and global elite solutions, and a Levy-driven local search mechanism. Moreover, root-finding algorithms are employed, which use informative mathematical principles to estimate new root offspring for leader/particle enhancement. A reinforcement learning algorithm is subsequently used to identify the optimal sequential deployment of these numerical analysis methods to increase robustness. Several medical imaging data sets, i.e., the ISIC 2017, PH2 and Dermofit skin lesion databases, the ALL-IDB2 microscopic blood image data set and the MURA musculoskeletal radiographic database, together with the CK+ facial expression data set, the Coswara respiratory audio data set and the UCF101 video action data set, are employed for evaluation. The proposed BBPSO-optimized Convolutional Neural Network (CNN), bidirectional Long Short-Term Memory (BiLSTM) with attention mechanism, and CNN-BiLSTM models significantly outperform those devised by other PSO and BBPSO variants, as well as existing state-of-the-art studies, for image classification, respiratory audio abnormality detection and realistic video action recognition.
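For orientation, the baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the classic bare-bones update, in which each new position is sampled per dimension from a Gaussian centred midway between a particle's personal best and the single swarm leader, with standard deviation equal to their distance. It is not the proposed variant: the neighbouring-best signals, Cauchy/Levy scale factors, root-finding steps and reinforcement learning scheduler are all omitted. The function names, the sphere test objective, the search bounds and the swarm settings are assumptions chosen for illustration only.

```python
import random


def bbpso_step(pbest, leader, rng):
    """Sample a new position, per dimension, from N((p + g) / 2, |p - g|)."""
    return [rng.gauss((p + g) / 2.0, abs(p - g)) for p, g in zip(pbest, leader)]


def sphere(x):
    """Toy objective (assumption): sum of squares, minimised at the origin."""
    return sum(v * v for v in x)


def run_bbpso(objective, dim=5, n_particles=20, n_iters=100, seed=0):
    """Minimise `objective` with a single-leader bare-bones swarm.

    Returns (best position, best fitness, best fitness of the initial swarm).
    """
    rng = random.Random(seed)
    # Initialise personal bests uniformly in an assumed box [-5, 5]^dim.
    pbest = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
             for _ in range(n_particles)]
    pbest_fit = [objective(p) for p in pbest]
    initial_best = min(pbest_fit)

    for _ in range(n_iters):
        # The single leader is the best personal best found so far.
        leader_idx = min(range(n_particles), key=pbest_fit.__getitem__)
        leader = pbest[leader_idx]
        for i in range(n_particles):
            candidate = bbpso_step(pbest[i], leader, rng)
            fit = objective(candidate)
            # Keep the candidate only if it improves the personal best.
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = candidate, fit

    best_idx = min(range(n_particles), key=pbest_fit.__getitem__)
    return pbest[best_idx], pbest_fit[best_idx], initial_best


if __name__ == "__main__":
    position, fitness, start = run_bbpso(sphere)
    print(f"initial best: {start:.4f}, final best: {fitness:.6f}")
```

In this baseline every particle is pulled towards the same leader, so the sampling variance collapses as the personal bests converge on it. That is the limited interaction with neighbouring elite solutions which the proposed search process is intended to address.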