Discrete optimal Bayesian classification with error-conditioned sequential sampling

Ariana Broumand, Mohammad Shahrokh Esfahani, Byung-Jun Yoon, Edward R. Dougherty

doi:10.1016/j.patcog.2015.03.023

Abstract

When prior knowledge concerning the feature-label distribution is available, in particular when it is known that the feature-label distribution belongs to an uncertainty class of distributions governed by a prior distribution, this prior knowledge can be used in conjunction with the training data to construct the optimal Bayesian classifier (OBC), whose performance is, on average, optimal among all classifiers relative to the posterior distribution derived from the prior distribution and the data. Typically in classification theory, it is assumed that sampling is performed randomly in accordance with the prior probabilities on the classes, and this has heretofore been true in the case of OBC. In the present paper, we propose to forgo random sampling and to utilize the prior knowledge and previously collected data to determine which class to sample from at each step of the sampling. Specifically, we choose to sample from the class that leads to the smallest expected classification error with the addition of the new sample point. We demonstrate the superiority of the resulting nonrandom sampling procedure to random sampling on both synthetic data and data generated from known biological pathways.
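The selection rule described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a two-class discrete (bin) model with independent Dirichlet priors on each class's bin probabilities and a known class-0 prior probability `c`, and it performs a one-step lookahead over the posterior predictive distribution of the next sample's bin. The function names (`expected_obc_error`, `lookahead_error`, `choose_class`) and the example counts are hypothetical, chosen only for illustration.

```python
import numpy as np

def expected_obc_error(counts0, counts1, alpha0, alpha1, c):
    """Posterior expected error of the classifier that labels each bin by the
    larger of c*E[p0_i] and (1-c)*E[p1_i] (assumed Dirichlet model, known c)."""
    p0 = (counts0 + alpha0) / (counts0.sum() + alpha0.sum())  # posterior mean, class 0
    p1 = (counts1 + alpha1) / (counts1.sum() + alpha1.sum())  # posterior mean, class 1
    return float(np.sum(np.minimum(c * p0, (1 - c) * p1)))

def lookahead_error(counts, alphas, c, y):
    """Expected error after drawing one more point from class y, averaged over
    the posterior predictive probability of the bin that point falls in."""
    pred = (counts[y] + alphas[y]) / (counts[y].sum() + alphas[y].sum())
    total = 0.0
    for j, pj in enumerate(pred):
        trial = [counts[0].copy(), counts[1].copy()]
        trial[y][j] += 1  # hypothesize that the new point lands in bin j
        total += pj * expected_obc_error(trial[0], trial[1], alphas[0], alphas[1], c)
    return total

def choose_class(counts, alphas, c):
    """Return the class whose next sample minimizes the lookahead error,
    together with both lookahead errors."""
    errs = [lookahead_error(counts, alphas, c, y) for y in (0, 1)]
    return int(np.argmin(errs)), errs

# Illustrative data: class 0 already has 20 points, class 1 has none.
counts = [np.array([10, 10]), np.array([0, 0])]
alphas = [np.ones(2), np.ones(2)]  # uniform Dirichlet hyperparameters (assumption)
chosen, errs = choose_class(counts, alphas, c=0.5)
```

In this toy setting the rule selects class 1, the class about which the posterior is least informed, since one more point there lowers the expected error more than one more point from class 0. A full sequential procedure would repeat this choice after each new sample is observed; the stopping rule and the exact error expression used in the paper may differ from this sketch.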
