Abstract
Background: A major challenge of bioinformatics in the era of precision medicine is to identify molecular biomarkers for complex diseases. Such biomarkers or signatures are generally expected to have not only strong discriminative ability but also a readable biological interpretation. Conventional expression-based or network-based methods mainly capture differential genes or differential networks as biomarkers; however, such biomarkers focus only on phenotypic discrimination and usually offer little biological or functional interpretation. Conventional function-based methods, in contrast, consider biomarkers corresponding to certain biological functions or pathways but ignore the differential information of genes, i.e., they disregard how active particular genes are within particular functions, which reduces their discriminative ability on phenotypes. Hence, there is a strong demand for computational methods that directly identify functional network biomarkers with both discriminative power on disease states and readable interpretation in terms of biological functions.
Results: In this paper, we present a new computational framework based on an integer programming model, named Comparative Network Stratification (CNS), to extract functional or interpretable network biomarkers that have both strong discriminative power on disease states and readable interpretation in terms of biological functions. In addition, CNS can not only recognize the pathogenic biological functions disregarded by traditional expression-based/network-based methods, but also uncover the active network structures underlying such dysregulated functions, which are underestimated by traditional function-based methods. To validate its effectiveness, we compared CNS with five state-of-the-art methods, namely GSVA, Pathifier, stSVM, frSVM and AEP, on four datasets of different complex diseases. The results show that CNS can enhance the discriminative power of network biomarkers and further provide biologically interpretable information on these biomarkers or on the pathogenic mechanisms of the disease. A case study on type 1 diabetes (T1D) demonstrates that CNS can identify many dysfunctional genes and networks previously disregarded by conventional approaches.
Conclusion: CNS is a powerful bioinformatics tool that can identify functional or interpretable network biomarkers with both discriminative power on disease states and readable interpretation in terms of biological functions. CNS was implemented as a Matlab package, which is available at http://www.sysbio.ac.cn/cb/chenlab/images/CNSpackage_0.1.rar.
Highlights
A major challenge of bioinformatics in the era of precision medicine is to identify the molecular biomarkers for complex diseases
Comparative Network Stratification (CNS) was implemented as a Matlab package, which is available at http://www.sysbio.ac.cn/cb/chenlab/images/CNSpackage_0.1.rar
Network-based methods, such as frSVM [5] and the stSVM algorithm [6], were proposed to extract active sub-networks as network biomarkers by considering biological network information (Fig. 1). Such analysis helps to interpret the mechanisms of complex diseases at a system level [7]. Although network biomarkers reflect the growing consensus that complex diseases are mostly caused by multiple genes through their sophisticated interactions rather than by individual genes [8, 9], network-based analysis cannot directly elucidate the biological or functional roles of the identified genes/interactions under specific conditions or in specific samples, because the interaction information is accumulated over all circumstances
Summary
A major challenge of bioinformatics in the era of precision medicine is to identify molecular biomarkers for complex diseases. Network-based methods, such as frSVM [5] and the stSVM algorithm [6], were proposed to extract active sub-networks as network biomarkers by considering biological network information (Fig. 1). Such analysis helps to interpret the mechanisms of complex diseases at a system level [7]. Although network biomarkers reflect the growing consensus that complex diseases are mostly caused by multiple genes through their sophisticated interactions rather than by individual genes [8, 9], network-based analysis cannot directly elucidate the biological or functional roles of the identified genes/interactions under specific conditions or in specific samples, because the interaction information is accumulated over all circumstances. The biological annotation deposited in databases is assembled from different resources or projects under various conditions, which makes it hard to determine precisely the actual states of particular biological functions under a specific condition, e.g. when a disease occurs in a person with a certain genetic or epigenetic background.