Abstract

A recently proposed optimal Bayesian classification paradigm addresses optimal error rate analysis for small-sample discrimination, including optimal classifiers, optimal error estimators, and error estimation analysis tools, with respect to the probability of misclassification in binary classification. Here, we address multi-class problems and optimal expected risk with respect to a given risk function, which are common settings in bioinformatics. We present Bayesian risk estimators (BRE) under arbitrary classifiers, the mean-square error (MSE) of arbitrary risk estimators under arbitrary classifiers, and optimal Bayesian risk classifiers (OBRC). We provide analytic expressions for these tools under several discrete and Gaussian models, and we present a new methodology to approximate the BRE and MSE when analytic expressions are not available. Of particular note, we present analytic forms for the MSE under Gaussian models with homoscedastic covariances, which are new even in binary classification.
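For orientation, the risk quantities above follow standard multi-class Bayes decision theory. The sketch below uses notation of our own choosing (a loss λ(i, j) for deciding class i when the true class is j, over M classes), which may differ from the paper's:

```latex
% Standard multi-class Bayes decision theory (notation is ours, not the paper's).
% \lambda(i,j): loss incurred by deciding class i when the true class is j.
\[
  R(i \mid x) = \sum_{j=0}^{M-1} \lambda(i, j)\, P(Y = j \mid X = x)
\]
% The Bayes (minimum-risk) classifier and its expected risk:
\[
  \psi_{\mathrm{Bayes}}(x) = \operatorname*{arg\,min}_{i}\, R(i \mid x),
  \qquad
  R(\psi) = \mathbb{E}\big[\lambda(\psi(X), Y)\big].
\]
```

Under zero-one loss, λ(i, j) = 1 when i ≠ j and 0 otherwise, so the conditional risk reduces to 1 − P(Y = i | x) and the expected risk to the probability of misclassification, recovering the binary error-rate setting of the earlier paradigm.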

Highlights

  • Classification in biomedicine is often constrained by small samples, so understanding the properties of the error rate is critical to ensuring the scientific validity of a designed classifier

  • Plug-in classification can be performed with linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) for multiple classes with arbitrary loss functions; these rules essentially assume that the underlying class-conditional densities are Gaussian with equal or unequal covariances, respectively

  • The analytic form that we provide for the mean-square error (MSE) of arbitrary risk estimators under homoscedastic Gaussian models is new, with no analog in prior work on binary classification under zero-one loss


Summary

Introduction

Classification in biomedicine is often constrained by small samples, so understanding the properties of the error rate is critical to ensuring the scientific validity of a designed classifier. A few classical classification algorithms naturally permit multiple classes and arbitrary loss functions; for example, a plug-in rule takes the functional form of an optimal Bayes decision rule under a given modeling assumption and substitutes sample estimates of the model parameters in place of the true parameters. This can be done with linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) for multiple classes with arbitrary loss functions; these rules essentially assume that the underlying class-conditional densities are Gaussian with equal or unequal covariances, respectively (a sketch of such a plug-in rule is given below). We also present a new, computationally efficient method to approximate the conditional MSE based on the effective joint density.
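As a minimal illustration of the plug-in approach, and not the paper's own code, the following sketch fits per-class Gaussian estimates (QDA-style) and classifies by minimizing the estimated conditional risk; the function names and the loss-matrix convention are our own:

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_plugin_gaussian(X, y, num_classes):
    """Estimate prior, mean, and covariance per class (QDA-style plug-in)."""
    params = []
    for c in range(num_classes):
        Xc = X[y == c]
        params.append({
            "prior": len(Xc) / len(X),
            "mean": Xc.mean(axis=0),
            "cov": np.cov(Xc, rowvar=False),
        })
    return params

def classify_min_risk(x, params, loss):
    """Assign the class minimizing the estimated conditional risk.

    loss[i, j] = cost of deciding class i when the true class is j.
    """
    # Unnormalized posteriors: prior_j * N(x; mean_j, cov_j), then normalize.
    post = np.array([
        p["prior"] * multivariate_normal.pdf(x, p["mean"], p["cov"])
        for p in params
    ])
    post /= post.sum()
    # Conditional risk of each decision i: sum_j loss[i, j] * P(j | x).
    risks = loss @ post
    return int(np.argmin(risks))
```

With zero-one loss, `loss = 1 - np.eye(num_classes)`, this reduces to choosing the class with the largest estimated posterior; an LDA-style variant would instead pool a single covariance estimate across all classes.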

Bayes decision theory
Optimal Bayesian risk classification
Classification rules: we consider five classification rules
Findings
Risk estimation rules: we consider four risk estimation methods
Conclusion