Abstract
Understanding the interrelationships among genes in a cellular system is fundamental to the investigation of cellular activities, because interrelated genes are functionally related, controlled by the same transcriptional regulatory process, or take part in a common biological process, and, most importantly, are known to be co-expressed. Most latent Mycobacterium tuberculosis (Mtb) genes have been discovered, but their functions, interrelationships and correlations, which would help in developing protocol(s) to tame the menace of tuberculosis at latency, have not been fully uncovered. We have developed a computational technique called the Fuzzified Adjusted Rand Index (FARI) to discover the co-expressed genes among identified latent Mtb genes and to perform functional analysis of the gene sets using an annotation database. FARI, a modification of the Adjusted Rand Index used to compare clustering results, is designed to analyze, establish and quantify the expression trend of two genes with different sample points. A rank matrix of all the genes under consideration is produced after each gene has been analyzed against the others, and this rank matrix serves as the basis for co-expression discovery. A synthetic gene expression dataset, a biological benchmark dataset (E. coli), and different sets of genes containing latent Mtb genes from an experimental result were fed into the computational tool, and different gene sets (modules) representing co-expressed genes were discovered. The gene modules discovered from latent Mtb genes are used to uncover the hub genes and their molecular functions. We were able to identify different co-expression networks from this analysis and to assign biological functional meaning to some of the important Mtb genes that emerged from the experiment. Moreover, discovering gene co-expression modules gives rise to a gene co-expression network, which is a preliminary step towards gene regulatory network discovery.
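For readers unfamiliar with the index that FARI modifies, the following is a minimal sketch of the standard (non-fuzzy) Adjusted Rand Index, computed from the contingency counts of two labelings. It is not the FARI method itself: the fuzzification step and the way gene expression trends are turned into labels are not specified in this section, so neither is shown. The function name `adjusted_rand_index` and the example labelings are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Standard Adjusted Rand Index between two labelings of the same items.

    Assumption: this is the classic index only, not the fuzzified
    variant (FARI) named in the abstract.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # fewer than two items: no pairs to disagree on

    # Contingency counts: items sharing cluster i in A and cluster j in B.
    pair_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of item pairs placed together, per table cell and per margin.
    sum_ij = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())

    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2        # largest attainable agreement
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (sum_ij - expected) / (max_index - expected)


# Same grouping under different label names, then a maximally crossed grouping.
same_grouping = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed_grouping = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

The index is 1 when two labelings group the items identically, is close to 0 for agreement at chance level, and can be negative when the groupings agree less than chance would predict.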
Highlights
Cellular activities are complex systems and have their foundation in the relationships or correlations among the cell constituents, which are represented as genes
Gene co-expression networks are extracted from microarray or RNA-seq data using expression patterns; the advent of microarray technology has given systems biologists the opportunity to study the dynamic behaviour of genes under multiple conditions [1, 5]
Because of the size of the dataset and the number of genes generated from this experiment, and because, as Luo et al. [1] note, identifying cellular networks automatically and objectively from genome-wide expression data remains challenging, we investigated the co-expression of the genes in scales and ranges such as the first 100 genes or genes 500–850
Summary
Cellular activities are complex systems and have their foundation in the relationships or correlations among the cell constituents, which are represented as genes. The interrelationship among genes in a cellular system is called a Gene Co-expression Network (GCN), because genes of the same network are known to be functionally related, controlled by the same transcriptional regulatory process, or involved in a common biological process (i.e., members of the same pathway or protein complex) [5]. A GCN is an undirected graph in which each node represents a gene and an edge between two nodes represents only a correlation or dependency relationship between the genes [2, 5]. In a gene co-expression network, the genes make up gene modules and the edges indicate significant correlations [3]. A module is a set of genes with similar expression patterns across the samples of a gene expression profiling experiment. Constructing a GCN is a process of developing modular networks within a cellular system, which allows us to understand the properties of the system.
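The graph definition above can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the source: Pearson correlation stands in for the similarity measure (the paper's own measure is FARI), the cutoff of 0.9 is arbitrary, modules are approximated as connected components, and the gene names `g1` to `g4` with their expression values are made-up placeholders.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no trend information
    return cov / sqrt(var_x * var_y)


def build_gcn(expression, threshold=0.9):
    """Undirected graph as {gene: set of neighbours}.

    An edge joins two genes whose profiles correlate at or above the
    (assumed, arbitrary) threshold.
    """
    graph = {gene: set() for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if pearson(expression[g1], expression[g2]) >= threshold:
            graph[g1].add(g2)  # store both directions: the edge is undirected
            graph[g2].add(g1)
    return graph


def connected_modules(graph):
    """Approximate modules as connected components (an assumption)."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(graph[gene] - component)
        seen |= component
        result.append(component)
    return result


# Hypothetical profiles: g1 and g2 rise together, g3 and g4 fall together.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.1, 6.0, 8.2],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.1, 6.0, 3.9, 2.0],
}
graph = build_gcn(expression)
gene_modules = connected_modules(graph)
```

With these placeholder profiles the graph has two edges, g1 to g2 and g3 to g4, so two modules emerge. Only positive correlation creates an edge here, matching the description of a module as genes with similar expression patterns; a different similarity measure or cutoff would generally yield different modules.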