Abstract

Network-based systems biology has become an important method for analyzing high-throughput gene expression data and mining gene function. Escherichia coli (E. coli) has long been a popular model organism for basic biological research. In this paper, the weighted gene co-expression network analysis (WGCNA) algorithm was applied to construct gene co-expression networks in E. coli. Thirty-one gene co-expression modules were detected from 1391 E. coli microarrays. Further characterization of these modules with the Database for Annotation, Visualization, and Integrated Discovery (DAVID) tool showed that they are associated with several kinds of biological processes, such as carbohydrate catabolism, fatty acid metabolism, amino acid metabolism, transport, translation, and ncRNA metabolism. Hub genes were also screened by intra-modular connectivity. Genes with unknown functions were annotated by guilt-by-association. Comparison with a previous prediction tool, EcoliNet, suggests that our dataset can expand gene predictions. In summary, 31 co-expression modules were identified in E. coli, 24 of which were functionally annotated. The analysis provides a resource for future gene discovery.

Highlights

  • Escherichia coli (E. coli) is an abundant bacterium in the intestine of humans and animals

  • 1761661_s_at, which annotated as gene an intergenic highly connected probe pairs were visualized by Cytoscape

  • Weighted gene co-expression network analysis (WGCNA) has been extensively applied for gene co-expression network construction in many species

Summary

Introduction

Escherichia coli (E. coli) is an abundant bacterium in the intestine of humans and animals. It is a single-celled prokaryote that is widely used as a model organism in biological research. The E. coli genome is about 4.64 Mb in size and encodes 5416 genes. Microarray technology is well developed and low in cost, and it has been widely used to investigate genome-wide gene expression. Extensive E. coli transcriptome data have been deposited in public databases. These data include gene expression measured under various conditions, such as different nutrients, growth stages, and gene mutations [1]. Scientists have been endeavoring to mine transcriptional regulation networks, gene expression networks, and protein–protein interaction networks using mathematical models [2].
