Abstract

A critical step in the analysis of large genome-wide gene expression datasets is the use of module detection methods to group genes into co-expression modules. Because of limitations of classical clustering methods, numerous alternative module detection methods have been proposed, which improve upon clustering by handling co-expression in only a subset of samples, modelling the regulatory network, and/or allowing overlap between modules. In this study we use known regulatory networks to do a comprehensive and robust evaluation of these different methods. Overall, decomposition methods outperform all other strategies, while we do not find a clear advantage of biclustering and network inference-based approaches on large gene expression datasets. Using our evaluation workflow, we also investigate several practical aspects of module detection, such as parameter estimation and the use of alternative similarity measures, and conclude with recommendations for the further development of these methods.

Highlights

  • A critical step in the analysis of large genome-wide gene expression datasets is the use of module detection methods to group genes into co-expression modules

  • Decomposition methods[16] and biclustering[17] try to handle local co-expression and overlap. These methods differ from clustering in that genes within a module need not be co-expressed in all biological samples; instead, a sample can influence the expression of a module to a certain degree or not at all

  • Our results indicate that decomposition methods detect the modules which best correspond to the known modular structure within the gene regulatory network (Fig. 2a)
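
The difference between clustering and decomposition described above can be illustrated with a toy model. The sketch below runs no actual decomposition algorithm; it only shows the structure such methods assume, where each gene has a weight on each component and each sample drives each component to some degree. All gene names, weights, and the thresholding rule are hypothetical and chosen for illustration, not taken from the study.

```python
# Toy illustration only: every name and number here is hypothetical.
# loadings[gene][k]  : how strongly the gene belongs to component k
# activity[k][sample]: how strongly the sample drives component k

loadings = {
    "g1": [1.0, 0.0],
    "g2": [0.8, 0.9],  # weight on both components, so it ends up in two modules (overlap)
    "g3": [0.0, 1.0],
    "g4": [0.1, 0.0],  # weak weights only, so it ends up in no module
}
activity = [
    [2.0, 1.0, 0.0],  # component 0 is silent in the third sample (local co-expression)
    [0.0, 0.0, 3.0],  # component 1 is active only in the third sample
]


def expression(gene, sample_idx):
    """Model expression as a weighted sum of component activities."""
    return sum(w * activity[k][sample_idx] for k, w in enumerate(loadings[gene]))


def modules_from_loadings(loadings, threshold=0.5):
    """Put a gene in every component whose absolute weight exceeds the threshold.

    Thresholding is an assumed, simple post-processing rule; unlike a hard
    clustering, it lets one gene belong to several modules or to none.
    """
    n_components = len(next(iter(loadings.values())))
    return [
        {gene for gene, weights in loadings.items() if abs(weights[k]) > threshold}
        for k in range(n_components)
    ]


modules = modules_from_loadings(loadings)
```

Here `g2` lands in both modules, and `g1` has zero modeled expression in the third sample even though it is a core member of the first module, which is the behavior a hard partition over all samples cannot represent.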

Results

Our evaluation procedure was structured as follows (Fig. 1). We applied publicly available module detection methods on nine gene expression compendia from Escherichia coli, [...]

[Figure labels: MERLIN, Genomica, GENIE3, CLR, Correlation, TIGRESS, Permuted modules, Sticky network, Scale-free network, FLAME, k-medoids, k-medoids*, Fuzzy c-means, SOM, k-means, MCL, Spectral 1, Affinity propagation, Spectral 2, Transitivity, WGCNA, Agglomerative, Hybrid, Divisive, Agglom.*, SOTA, Dclust, CLICK, DBSCAN]
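
As a rough illustration of what comparing detected modules against known modules can look like, the sketch below scores two sets of modules with a best-match Jaccard index, once from the side of the known modules and once from the side of the detected ones. This is an assumed, generic scoring scheme, not necessarily the one used in the study; the function names, variable names, and gene labels are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two gene sets: shared genes divided by all genes."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match_score(queries, targets):
    """Average, over the query modules, of the best Jaccard against any target."""
    if not queries or not targets:
        return 0.0
    return sum(max(jaccard(q, t) for t in targets) for q in queries) / len(queries)


# Hypothetical toy data: two known modules and three detected ones.
known = [{"a", "b", "c"}, {"d", "e"}]
observed = [{"a", "b"}, {"d", "e"}, {"x", "y"}]

# How well is each known module matched by some detected module?
known_side = best_match_score(known, observed)
# How well does each detected module match some known module?
observed_side = best_match_score(observed, known)
```

Scoring in both directions matters because a method that returns many spurious modules can still match every known module well; the spurious `{"x", "y"}` module lowers only the second score.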

