Abstract

Biological systems respond to environmental perturbations and to a large diversity of compounds through gene interactions, and these genetic factors form complex networks. In particular, a wide variety of gene co-expression networks have been constructed in recent years thanks to the dramatic increase in experimental information obtained with techniques such as microarrays and RNA sequencing. These networks allow the identification of groups of co-expressed genes that may function in the same process and that, in turn, may be related to biological functions of industrial, medical and academic interest. In this study, gene co-expression networks for 17 bacterial organisms from the COLOMBOS database were analyzed via weighted gene co-expression network analysis and clustered into modules of genes with similar expression patterns for each species. These modules were then tested with a hypergeometric approach, based on a set of transcription factors and enzymes for each genome, to determine which of them were relevant. The most enriched modules were characterized using PFAM families and KEGG metabolic maps. Additionally, we conducted a Gene Ontology enrichment analysis of biological functions. Finally, we used comparative genomics to identify modules that were similar across all the studied organisms.

Highlights

  • Organisms are dynamic systems that respond to intracellular and extracellular signals through the regulated expression of their genes

  • To determine which genes share similar co-expression patterns in bacteria, a set of co-expression networks was inferred for 17 different bacteria with the Weighted Gene Co-expression Network Analysis (WGCNA) R package (Langfelder and Horvath, 2008), based on the information deposited in the COLOMBOS database (Moretto et al, 2016)

  • In the dataset used in our study, the number of samples did not reflect the number of Gene Expression Omnibus (GEO) series used for each bacterium, and this would have influenced the number of modules identified for each organism. For example, the samples for Bacillus anthracis strain Ames (Ban) belonged to 4 GEO series and those for Helicobacter pylori 26695 (Hpy) to 8 GEO series, while the Salmonella enterica LT2 (Stm) samples came from 16 GEO series

Summary

Introduction

Organisms are dynamic systems that respond to intracellular and extracellular signals through the regulated expression of their genes. Recent approaches have shown that there are underlying properties that can only be explained by studying organisms as complex systems (Kitano, 2002; Trewavas, 2006). In this context, one systematic way to understand gene expression in a particular genome is through Gene Co-expression Networks (GCNs), where the network G = (V, E) is composed of a set of nodes (V) that represent the genes and a set of edges (E) that indicate significant co-expression relationships (Stuart et al, 2003; Junker and Schreiber, 2008). These types of networks maintain the structural properties of real networks, such as scale-free topology, which means that there are a few highly connected nodes, namely hubs, and a large number of nodes with a small number of connections (Van Noort et al, 2004; Tsaparas et al, 2006).
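
To make the definition G = (V, E) concrete, the following minimal Python sketch builds a toy co-expression network and counts the connections of each node. It is an illustration only and not the procedure used in the study: the gene names and expression values are invented, the functions pearson, build_network and degrees are hypothetical helpers, and the hard cutoff of 0.9 on the absolute Pearson correlation is an assumed simplification (WGCNA weights edges by soft thresholding rather than cutting them at a fixed value).

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def build_network(expression, threshold=0.9):
    """Return (V, E): genes as nodes, strongly correlated pairs as edges.

    The hard threshold is an assumption made for this sketch.
    """
    nodes = set(expression)
    edges = set()
    for g1, g2 in combinations(sorted(expression), 2):
        # Absolute value: negative correlations also count (unsigned network).
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            edges.add((g1, g2))
    return nodes, edges


def degrees(nodes, edges):
    """Number of edges touching each node; hubs are the high-degree nodes."""
    deg = {node: 0 for node in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Invented expression profiles: five genes measured in five samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [1.2, 1.9, 3.1, 4.2, 4.8],
    "geneD": [5.0, 4.1, 2.9, 2.2, 0.8],
    "geneE": [3.0, 1.0, 4.0, 1.0, 5.0],
}

V, E = build_network(expression)
deg = degrees(V, E)
for gene in sorted(deg):
    print(gene, deg[gene])
```

In this toy data, geneA to geneD are tightly correlated (geneD negatively) and end up connected to one another, while geneE stays isolated. In a real network, the distribution of these degree values is what would be inspected to judge scale-free topology and to pick out hubs.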
