A Dirichlet-Multinomial Bayes Classifier for Disease Diagnosis with Microbial Compositions.

Xiang Gao, Huaiying Lin, Qunfeng Dong, Katherine Mcmahon

doi:10.1128/mspheredirect.00536-17

Open Access

https://doi.org/10.1128/mspheredirect.00536-17

Journal: mSphere	Publication Date: Dec 13, 2017
Citations: 6	License type: CC BY 4.0

Affiliation: Loyola University Chicago

Abstract

Dysbiosis of microbial communities is associated with various human diseases, raising the possibility of using microbial compositions as biomarkers for disease diagnosis. We have developed a Bayes classifier by modeling microbial compositions with Dirichlet-multinomial distributions, which are widely used to model multicategorical count data with extra variation. The parameters of the Dirichlet-multinomial distributions are estimated from training microbiome data sets by maximum likelihood. The posterior probability that a microbiome sample belongs to a disease or healthy category is calculated with Bayes' theorem, using the likelihood values computed from the estimated Dirichlet-multinomial distributions together with a prior probability estimated from the training microbiome data set or from previously published information on disease prevalence. When tested on real-world microbiome data sets, our method, called DMBC (for Dirichlet-multinomial Bayes classifier), shows better classification accuracy than both the only existing Bayesian microbiome classifier, which is based on a Dirichlet-multinomial mixture model, and the popular random forest method. An advantage of DMBC is its built-in automatic feature selection, which uses cross-validation to identify the subset of microbial taxa that gives the best classification accuracy between different classes of samples. This unique ability enables DMBC to maintain and even improve its accuracy when modeling species-level taxa. The R package for DMBC is freely available at https://github.com/qunfengdong/DMBC.

Importance

By incorporating prior information on disease prevalence, Bayes classifiers have the potential to estimate disease probability better than other common machine-learning methods. Thus, it is important to develop Bayes classifiers specifically tailored for microbiome data. Our method shows higher classification accuracy than the only existing Bayesian classifier and the popular random forest method, and thus provides an alternative option for using microbial compositions for disease diagnosis.
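The posterior calculation described in the abstract can be illustrated with a minimal Python sketch. This is not the DMBC R package or its API. It assumes the Dirichlet-multinomial parameters for each class have already been estimated, and the function names, parameter values, taxa counts, and priors below are all hypothetical. Maximum likelihood estimation and feature selection are not shown.

```python
import math


def dm_log_likelihood(counts, alpha):
    """Log-probability of a taxon count vector under a Dirichlet-multinomial."""
    n = sum(counts)
    a = sum(alpha)
    # Multinomial coefficient: n! / prod(x_k!)
    log_p = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Normalizing term: Gamma(A) / Gamma(n + A)
    log_p += math.lgamma(a) - math.lgamma(n + a)
    # Per-taxon terms: Gamma(x_k + alpha_k) / Gamma(alpha_k)
    log_p += sum(
        math.lgamma(x + ak) - math.lgamma(ak) for x, ak in zip(counts, alpha)
    )
    return log_p


def posterior(counts, alphas_by_class, priors):
    """Posterior class probabilities via Bayes' theorem, computed in log space."""
    log_joint = {
        c: dm_log_likelihood(counts, alphas_by_class[c]) + math.log(priors[c])
        for c in alphas_by_class
    }
    # Log-sum-exp keeps the normalization numerically stable.
    m = max(log_joint.values())
    log_norm = m + math.log(sum(math.exp(v - m) for v in log_joint.values()))
    return {c: math.exp(v - log_norm) for c, v in log_joint.items()}


# Hypothetical, already-estimated parameters for three taxa (illustration only).
alphas = {"disease": [8.0, 1.0, 1.0], "healthy": [2.0, 4.0, 4.0]}
# Hypothetical prior, e.g. taken from published disease prevalence.
priors = {"disease": 0.1, "healthy": 0.9}
sample = [30, 5, 5]  # hypothetical read counts per taxon

print(posterior(sample, alphas, priors))
```

Working in log space avoids underflow, since the likelihood of a sample with thousands of reads is far too small to represent directly. Changing `priors` shows how an assumed disease prevalence shifts the posterior for the same sample.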
