Incorporating biological prior knowledge for Bayesian learning via maximal knowledge-driven information priors

Shahin Boluki, Edward R Dougherty, Xiaoning Qian, Mohammad Shahrokh Esfahani

doi:10.1186/s12859-017-1893-4


Open Access

https://doi.org/10.1186/s12859-017-1893-4


Abstract

Background: Phenotypic classification is problematic because small samples are ubiquitous; and, for these, use of prior knowledge is critical. If knowledge concerning the feature-label distribution – for instance, genetic pathways – is available, then it can be used in learning. Optimal Bayesian classification provides optimal classification under model uncertainty. It differs from classical Bayesian methods in which a classification model is assumed and prior distributions are placed on model parameters. With optimal Bayesian classification, uncertainty is treated directly on the feature-label distribution, which assures full utilization of prior knowledge and is guaranteed to outperform classical methods.

Results: The salient problem confronting optimal Bayesian classification is prior construction. In this paper, we propose a new prior construction methodology based on a general framework of constraints in the form of conditional probability statements. We call this prior the maximal knowledge-driven information prior (MKDIP). The new constraint framework is more flexible than our previous methods as it naturally handles the potential inconsistency in archived regulatory relationships and conditioning can be augmented by other knowledge, such as population statistics. We also extend the application of prior construction to a multinomial mixture model when labels are unknown, which often occurs in practice. The performance of the proposed methods is examined on two important pathway families, the mammalian cell-cycle and a set of p53-related pathways, and also on a publicly available gene expression dataset of non-small cell lung cancer when combined with the existing prior knowledge on relevant signaling pathways.

Conclusion: The new proposed general prior construction framework extends the prior construction methodology to a more flexible framework that results in better inference when proper prior knowledge exists. Moreover, the extension of optimal Bayesian classification to multinomial mixtures where data sets are both small and unlabeled, enables superior classifier design using small, unstructured data sets. We have demonstrated the effectiveness of our approach using pathway information and available knowledge of gene regulating functions; however, the underlying theory can be applied to a wide variety of knowledge types, and other applications when there are small samples.

Highlights

Phenotypic classification is problematic because small samples are ubiquitous; and, for these, use of prior knowledge is critical
At each step in a Boolean network with perturbation (BNp), a Bernoulli random variable with success probability equal to the perturbation probability, p_pert, decides whether a node value is determined by a perturbation that randomly flips its value or by the logic model imposed by the interactions in the signaling pathways
The performance of the proposed framework is compared with other methods on a publicly available gene expression dataset of non-small cell lung cancer when combined with the existing prior knowledge on relevant signaling pathways
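
The BNp update described in the highlights can be illustrated with a minimal sketch. It assumes a per-node Bernoulli draw and a synchronous update, as the highlight suggests; the three-node network, the rule table `TOY_RULES`, and the function name `bnp_step` are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical toy logic model (not from the paper): each rule maps the
# current state of all nodes to the next value of one node.
TOY_RULES = {
    "a": lambda s: 1 - s["c"],      # a is inhibited by c
    "b": lambda s: s["a"],          # b is activated by a
    "c": lambda s: s["a"] & s["b"], # c needs both a and b
}


def bnp_step(state, rules, p_pert, rng):
    """One synchronous step of a Boolean network with perturbation.

    For every node, a Bernoulli(p_pert) draw decides whether the node is
    flipped (perturbation) or updated by its logic rule.
    """
    next_state = {}
    for node, value in state.items():
        if rng.random() < p_pert:
            next_state[node] = 1 - value           # random flip
        else:
            next_state[node] = rules[node](state)  # pathway logic
    return next_state


if __name__ == "__main__":
    rng = random.Random(0)
    state = {"a": 1, "b": 0, "c": 0}
    for _ in range(5):
        state = bnp_step(state, TOY_RULES, p_pert=0.05, rng=rng)
        print(state)
```

With `p_pert=0` the trajectory is fully determined by the logic rules, and with `p_pert=1` every node flips at every step; intermediate values mix the two behaviours.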

Summary

Introduction

Phenotypic classification is problematic because small samples are ubiquitous; and, for these, use of prior knowledge is critical. If knowledge concerning the feature-label distribution is available, say, genetic pathways, it can be used to design an optimal Bayesian classifier (OBC), for which uncertainty is treated directly on the feature-label distribution. This assures full utilization of prior knowledge and, in terms of expected error over the uncertainty class, guarantees performance at least as good as that of classical methods. Regarding prior construction, E. T. Jaynes remarked that "there must exist a general formal theory of determination of priors by logical analysis of prior information – and that to develop it is today the top priority research problem of Bayesian theory". It is precisely this kind of formal structure that is presented in this paper. The constraints, expressed as conditional probability statements derived from prior knowledge, tighten the prior distribution in accordance with that knowledge, while at the same time avoiding inadvertent over-restriction of the prior, an important consideration with small samples.
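
To make the role of the prior concrete, the following is a minimal sketch of an optimal Bayesian classifier for a discrete (multinomial) feature with Dirichlet priors on the class-conditional bin probabilities. This is the standard Dirichlet-multinomial form, not the MKDIP construction itself: the hyperparameters `alpha0` and `alpha1` are hand-picked, hypothetical stand-ins for what a knowledge-driven prior would supply, and `expected_c0` (the expected prior probability of class 0) is assumed known.

```python
from collections import Counter


def posterior_mean_bins(samples, alpha):
    """Posterior mean of the bin probabilities under a Dirichlet(alpha) prior.

    samples: observed bin indices for one class
    alpha:   Dirichlet hyperparameters, one per bin (hypothetical values)
    """
    counts = Counter(samples)
    total = sum(alpha) + len(samples)
    return [(alpha[b] + counts.get(b, 0)) / total for b in range(len(alpha))]


def discrete_obc(samples0, samples1, alpha0, alpha1, expected_c0=0.5):
    """Return the label assigned to each bin.

    A bin is labeled 0 when the posterior-expected joint mass of class 0
    in that bin is at least that of class 1, and 1 otherwise.
    """
    mean0 = posterior_mean_bins(samples0, alpha0)
    mean1 = posterior_mean_bins(samples1, alpha1)
    return [
        0 if expected_c0 * m0 >= (1.0 - expected_c0) * m1 else 1
        for m0, m1 in zip(mean0, mean1)
    ]


if __name__ == "__main__":
    # Flat priors with a few observations per class.
    print(discrete_obc([0, 0, 1], [3, 3, 2], [1, 1, 1, 1], [1, 1, 1, 1]))
    # No data at all: the (informative) priors alone decide the labels.
    print(discrete_obc([], [], [3, 1], [1, 3]))
```

The second call shows why prior construction matters with small samples: when few or no observations are available, the classifier is driven almost entirely by the hyperparameters.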



Journal: BMC Bioinformatics	Publication Date: Dec 1, 2017
Citations: 37	License type: open-access


Similar Papers

Constructing Pathway-Based Priors within a Gaussian Mixture Model for Bayesian Regression and Classification.
Shahin Boluki ... Mohammad Shahrokh Esfahani
IEEE/ACM Transactions on Computational Biology and Bioinformatics | VOL. 16 | 30 Nov 2017

Optimal classifiers with minimum expected error within a Bayesian framework — Part II: Properties and performance analysis
Lori A Dalton ... Edward R Dougherty
Pattern Recognition | VOL. 46 | 02 Nov 2012

Optimal Bayesian Classification With Missing Values
Siamak Zamani Dadaneh ... Xiaoning Qian
IEEE Transactions on Signal Processing | VOL. 66 | 15 Aug 2018

Incorporating prior knowledge induced from stochastic differential equations in the classification of stochastic observations.
Amin Zollanvari ... Edward R Dougherty
EURASIP Journal on Bioinformatics and Systems Biology | VOL. 2016 | 20 Jan 2016
