Biclustering of Expression Data Using Simulated Annealing

K Bryan, P Cunningham, N Bolshakova

doi:10.1109/cbms.2005.37

Abstract

In a gene expression data matrix, a bicluster is a grouping of a subset of genes and a subset of conditions that show correlated levels of expression activity. The difficulty of finding significant biclusters in gene expression data grows exponentially with the size of the dataset, so heuristic approaches such as Cheng and Church's greedy node deletion algorithm are required. Stochastic search techniques such as genetic algorithms or simulated annealing might be expected to produce better solutions than greedy search. In this paper we show that a simulated annealing approach is well suited to this problem, and we present a comparative evaluation of simulated annealing and node deletion on a variety of datasets. We show that simulated annealing discovers more significant biclusters in many cases.
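To make the idea concrete, the following is a minimal sketch of simulated annealing applied to biclustering, not the implementation evaluated in the paper. It scores a candidate bicluster with the mean squared residue used by Cheng and Church. The cost function (residue minus a size reward), the single-toggle move, the geometric cooling schedule, and all names and parameter values (`anneal_bicluster`, `size_weight`, `cooling`, `min_size`) are assumptions made for illustration.

```python
import math
import random


def mean_squared_residue(data, rows, cols):
    """Mean squared residue of the submatrix of `data` given by rows x cols."""
    rows, cols = list(rows), list(cols)
    n, m = len(rows), len(cols)
    row_mean = {i: sum(data[i][j] for j in cols) / m for i in rows}
    col_mean = {j: sum(data[i][j] for i in rows) / n for j in cols}
    total_mean = sum(row_mean.values()) / n
    return sum(
        (data[i][j] - row_mean[i] - col_mean[j] + total_mean) ** 2
        for i in rows
        for j in cols
    ) / (n * m)


def anneal_bicluster(data, steps=5000, t_start=1.0, cooling=0.999,
                     size_weight=0.01, min_size=2, seed=0):
    """Search for a low-residue, large bicluster; returns (cost, rows, cols)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(data), len(data[0])
    rows, cols = set(range(n_rows)), set(range(n_cols))

    def cost(r, c):
        # Assumed trade-off: low residue is good, large biclusters are good.
        return mean_squared_residue(data, r, c) - size_weight * len(r) * len(c)

    current = cost(rows, cols)
    best = (current, set(rows), set(cols))
    temp = t_start
    for _ in range(steps):
        new_rows, new_cols = set(rows), set(cols)
        # Propose a neighbour: toggle membership of one random row or column.
        if rng.random() < 0.5:
            new_rows ^= {rng.randrange(n_rows)}
        else:
            new_cols ^= {rng.randrange(n_cols)}
        if len(new_rows) >= min_size and len(new_cols) >= min_size:
            candidate = cost(new_rows, new_cols)
            delta = candidate - current
            # Metropolis rule: always accept improvements; accept a worse
            # move with probability exp(-delta / temp).
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                rows, cols, current = new_rows, new_cols, candidate
                if current < best[0]:
                    best = (current, set(rows), set(cols))
        temp *= cooling  # geometric cooling schedule
    return best
```

Unlike greedy node deletion, which only ever removes the row or column that most reduces the residue, this search can accept a temporarily worse bicluster while the temperature is high, which is what allows it to escape local minima.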

Full Text