Biclustering analysis of gene expression data can reveal many biologically significant local expression patterns. Many biclustering algorithms therefore rely on meta-heuristics such as the genetic algorithm (GA) and cuckoo search (CS) to search for biclusters. However, different meta-heuristics have different strengths and ranges of applicability. For example, the CS algorithm obtains high-quality biclusters and has strong global search ability, but its local search ability is relatively poor. In contrast, the GA has strong local search ability but poor global search ability. To improve both the global search ability and the coverage of the biclusters, as well as the local search ability and the quality of the biclusters, this paper proposes a meta-heuristic algorithm that combines the GA and CS algorithms (GA-CS Biclustering, GACSB) for the biclustering of gene expression data. The algorithm uses the CS algorithm as its main framework and applies the tournament strategy and the elite retention strategy of the GA to generate the next generation of the population. In experiments against common biclustering algorithms such as CC, FLOC, ISA, SEBI, SSB and CSB, the GACSB algorithm obtains biclusters of both high quality and high biological significance. In addition, we evaluate the results with different bicluster evaluation indicators, namely Average Correlation Value (ACV), Mean-Squared Residue (MSR) and Virtual Error (VE), and verify that the GACSB algorithm has strong scalability.
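The two GA ingredients named above, together with the MSR indicator, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GACSB implementation: the function names `mean_squared_residue` and `next_generation`, the `(rows, cols)` index-tuple encoding of a bicluster, and the parameters `n_elite` and `k` are all hypothetical, and the CS step that would produce new candidate biclusters is omitted. MSR follows the standard Cheng and Church definition, where a lower value means a more coherent bicluster.

```python
import random


def mean_squared_residue(matrix, rows, cols):
    """Cheng-Church mean squared residue of the submatrix matrix[rows][cols]."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_r, n_c = len(rows), len(cols)
    row_means = [sum(r) / n_c for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    total_mean = sum(row_means) / n_r
    # Residue of each cell: value - row mean - column mean + overall mean.
    return sum(
        (sub[i][j] - row_means[i] - col_means[j] + total_mean) ** 2
        for i in range(n_r)
        for j in range(n_c)
    ) / (n_r * n_c)


def next_generation(population, fitness, n_elite=1, k=2, rng=random):
    """Elite retention plus k-way tournament selection (lower fitness is better).

    Assumption: the population size stays constant between generations.
    """
    # Elite retention: the n_elite best individuals survive unchanged.
    survivors = sorted(population, key=fitness)[:n_elite]
    # Tournament: draw k individuals at random and keep the best of them.
    while len(survivors) < len(population):
        contestants = rng.sample(population, k)
        survivors.append(min(contestants, key=fitness))
    return survivors


# Toy expression matrix (made-up values): the first three columns follow a
# perfectly additive pattern, the fourth column does not.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 0.0],
    [4.0, 5.0, 6.0, 5.0],
]
population = [
    ((0, 1, 2), (0, 1, 2)),     # coherent bicluster
    ((0, 1, 2), (0, 1, 2, 3)),  # includes the noisy column
    ((0, 1), (2, 3)),
    ((1, 2), (0, 3)),
]


def msr_fitness(bicluster):
    return mean_squared_residue(data, bicluster[0], bicluster[1])


new_population = next_generation(population, msr_fitness, rng=random.Random(0))
```

Using MSR alone as the fitness favors very small biclusters, so practical fitness functions usually also reward bicluster volume. The sketch leaves that term out for brevity.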