Abstract

A disease molecular signature is a set of biomolecular features that are prognostic of clinical phenotypes and indicative of underlying pathology. Developing computational approaches that find more relevant molecular signatures is therefore of great importance. Based on the hypothesis that the components of a molecular signature are likely to share similar patterns, we introduced a novel three-step network-based approach (TSNBA) to identify the molecular signature and key pathological regulators. In the first step, a protein-protein interaction (PPI) network was integrated with a ranking algorithm to find pathology-related proteins with high accuracy. In the second step, the candidates were further screened by co-expression pattern for better pathology enrichment. In the third step, the context likelihood of relatedness (CLR) algorithm was used to infer gene regulatory networks and identify key transcription regulators. We applied this approach to study inflammation stimulated by IL-1 (interleukin-1) and TNF-alpha (tumor necrosis factor-alpha). TSNBA identified the inflammatory signature with high accuracy and outperformed five competing methods: the fold change, degree, interconnectivity, neighborhood score, and network propagation based approaches. The best molecular signature, in which 80% (40/50) of the genes were confirmed inflammatory genes, was used to predict inflammation-related genes. As a result, 8 of the 10 predicted inflammation genes that were not included in the benchmark Entrez Gene database were validated by literature evidence. Furthermore, 23 of the 32 predicted inflammation regulators were validated by literature evidence, and the remaining 9 were validated by TF (transcription factor) binding site analysis. In conclusion, we developed an efficient strategy for finding disease molecular signatures and identifying key pathological regulators.
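To make the third step more concrete, the following is a minimal sketch of the scoring stage of CLR as it is commonly described: each pairwise similarity is converted to a z-score against the background distribution of each of the two genes, negative z-scores are clipped to zero, and the two are combined. It assumes a precomputed symmetric mutual-information matrix as input; the function name `clr_scores` and the toy matrix are illustrative and are not taken from the study, whose actual implementation and parameters are not given here.

```python
import numpy as np


def clr_scores(mi):
    """Turn a symmetric gene-by-gene similarity matrix into CLR scores.

    `mi` is assumed to be a precomputed mutual-information matrix
    (how it is estimated is outside the scope of this sketch).
    """
    mi = np.asarray(mi, dtype=float)
    n = mi.shape[0]

    # Background for gene i: its similarities to all *other* genes.
    off_diag = ~np.eye(n, dtype=bool)
    masked = np.where(off_diag, mi, np.nan)
    mean = np.nanmean(masked, axis=1, keepdims=True)
    std = np.nanstd(masked, axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid division by zero for flat backgrounds

    # z[i, j]: how unusual MI(i, j) is relative to gene i's background,
    # with negative values clipped to zero.
    z = np.maximum(0.0, (mi - mean) / std)

    # Combine both genes' views of the same pair.
    scores = np.sqrt(z ** 2 + z.T ** 2)
    np.fill_diagonal(scores, 0.0)
    return scores


# Toy example: genes 0 and 1 share much more information than any other pair.
toy_mi = np.array([
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
])
toy_scores = clr_scores(toy_mi)
```

In a regulatory-network setting, the rows would typically be restricted to transcription factors and the highest-scoring TF-gene pairs kept as candidate regulatory edges; any score threshold would be a further assumption.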

Highlights

  • A molecular signature is defined as a set of biomolecular features that can be used as markers for a particular phenotype and underlying condition-related biological mechanisms

  • Signature components obtained from principal component analysis (PCA) and partial least squares (PLS) are often difficult to interpret

  • The three-step network-based approach (TSNBA) identified a better inflammation-enriched signature

  • The final protein-protein interaction (PPI) network used in this study consisted of 7633 genes and 30995 interactions. It included 1469 human TFs derived from the AnimalTFDB database and 1462 inflammatory genes extracted from the Entrez Gene database

Summary

Introduction

A molecular signature is defined as a set of biomolecular features that can be used as markers for a particular phenotype and underlying condition-related biological mechanisms. It can be a set of genes, proteins, metabolites, genetic variants, or microRNAs. Molecular signatures have been derived and applied for various purposes [1,2], including disease diagnosis and risk assessment [3,4,5,6,7], prediction of physiological toxicity [8,9], and prediction of response to therapeutic drugs [10,11]. Methods that integrate multiple data sets and multiple data types with network-based approaches have been shown to find accurate and robust molecular signatures [1].

