Abstract

The increasing use of microarray technologies is generating large amounts of data that must be processed to extract the underlying gene expression patterns. Existing clustering methods can suffer from certain drawbacks; in particular, most cannot automatically separate scattered, singleton and mini-cluster genes from other genes. Including these genes in the regular clustering process can impede the identification of gene expression patterns. In this paper, we propose a general clustering method, dynamic agglomerative clustering (DAC). DAC automatically separates scattered, singleton and mini-cluster genes from other genes and thus avoids the contamination of gene expression patterns that they can cause. With DAC, a scattered-gene filtering step is no longer necessary in data pre-processing. In addition, we propose a criterion for evaluating clustering results on datasets that contain scattered, singleton and/or mini-cluster genes. DAC has been applied successfully to two real datasets to identify gene expression patterns. Our numerical results indicate that DAC outperforms other clustering methods, such as quality-based and model-based clustering, on datasets that contain scattered, singleton and/or mini-cluster genes.
