Abstract

Marginal Bayesian predictive classifiers (mBpc), as opposed to simultaneous Bayesian predictive classifiers (sBpc), handle each data point separately and hence tacitly assume independence of the observations. Because learning of the generative model parameters saturates, the adverse effect of this false assumption on the accuracy of mBpc tends to wear off as the amount of training data increases, which guarantees the convergence of the two classifiers under the de Finetti type of exchangeability. This result, however, is far from trivial for sequences generated under Partition Exchangeability (PE), where even an arbitrarily large amount of training data does not rule out the possibility of an unobserved outcome (Wonderland!). We provide a computational scheme that allows the generation of sequences under PE. Based on that, with a controlled increase of the training data, we show the convergence of the sBpc and mBpc. This supports the use of the simpler and computationally more efficient marginal classifiers instead of the simultaneous ones. We also provide parameter estimation for the generative model giving rise to the partition exchangeable sequence, as well as a testing paradigm for the equality of this parameter across different samples. The package for Bayesian predictive supervised classification, parameter estimation and hypothesis testing under the Ewens sampling formula generative model is deposited on CRAN as the PEkit package.
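To make the idea of generating a sequence under PE concrete, here is a minimal Python sketch based on the Hoppe urn (Chinese restaurant process), a standard construction whose partitions follow the Ewens sampling formula. It is an illustration under stated assumptions, not the paper's own scheme or the PEkit implementation; the function names `sample_pe_sequence` and `partition_counts` are hypothetical.

```python
import random
from collections import Counter


def sample_pe_sequence(n, psi, seed=None):
    """Draw n labels from a Hoppe urn with parameter psi > 0 (illustrative sketch)."""
    rng = random.Random(seed)
    labels = []
    next_label = 0
    for i in range(n):
        # After i draws, a previously unseen label appears with
        # probability psi / (psi + i); the first draw is always new.
        if rng.random() < psi / (psi + i):
            labels.append(next_label)
            next_label += 1
        else:
            # Otherwise repeat the label of a uniformly chosen earlier item,
            # so each existing label is picked proportionally to its count.
            labels.append(rng.choice(labels))
    return labels


def partition_counts(labels):
    """Return rho as {j: number of distinct labels observed exactly j times}."""
    label_freq = Counter(labels)
    return dict(Counter(label_freq.values()))


if __name__ == "__main__":
    seq = sample_pe_sequence(30, psi=2.0, seed=1)
    print(seq)
    print(partition_counts(seq))
```

However long the sequence already is, the next draw still has probability psi / (psi + i) > 0 of being a new label, which is the "unobserved outcome" feature the abstract refers to.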

Highlights

  • Under the broad realm of inductive inference, the goal of supervised classification is to assign test objects to an a priori defined number of classes learned from the training data [1].

  • The likelihood L of multiple independent samples from the Poisson-Dirichlet (PD) distribution is a product of the density functions for the partitions ρ of those samples. Under H1, the maximum likelihood estimate (MLE) of ψ for each sample is evaluated as in (3) from that sample alone, since the other samples have no effect on the ψ of a single sample.

  • We provided an estimation and hypothesis testing scheme for the dispersal parameter of the PD distribution associated with sequences under Partition Exchangeability (PE).
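As an illustration of the per-sample estimation mentioned in the highlights, the sketch below computes the MLE of ψ for one sample under the Ewens sampling formula. It relies on the standard result that, for a sample of size n with k distinct classes, the MLE solves sum_{i=0}^{n-1} ψ/(ψ+i) = k. Whether this coincides exactly with the paper's equation (3) is an assumption, and the names `expected_blocks` and `mle_psi` are hypothetical, not PEkit functions.

```python
def expected_blocks(psi, n):
    """Expected number of distinct classes in a sample of size n under Ewens(psi)."""
    return sum(psi / (psi + i) for i in range(n))


def mle_psi(n, k, tol=1e-10):
    """Solve expected_blocks(psi, n) = k for psi by bisection (illustrative sketch).

    The left-hand side increases from 1 to n as psi grows, so a finite
    positive root exists only when 1 < k < n.
    """
    if not 1 < k < n:
        raise ValueError("a finite positive MLE requires 1 < k < n")
    lo, hi = 1e-12, 1.0
    # Grow the upper bracket until it overshoots the observed k.
    while expected_blocks(hi, n) < k:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expected_blocks(mid, n) < k:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Each sample is handled on its own: only its size and class count matter.
    for n, k in [(100, 10), (200, 35)]:
        print(n, k, mle_psi(n, k))
```

Because the number of distinct classes k is a sufficient statistic for ψ here, each sample's estimate depends only on its own (n, k), which matches the statement that other samples do not affect it under H1.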


Summary

Introduction

Under the broad realm of inductive inference, the goal of supervised classification is to assign test objects to an a priori defined number of classes learned from the training data [1]. In addition to the presentation of the predictive rules under this type of exchangeability [11] and the asymptotic representation of the number of classes for the sampling species sequences [12], our derivation based on the Bayesian classifiers shows that the simultaneous and marginal predictive classifiers converge as the amount of data grows without bound. This is consistent with a similar study under de Finetti exchangeability with multinomial modelling [5].

Partition Exchangeability
Parameter Estimation
Hypothesis Testing
Supervised Classifiers under PE
Numerical Illustrations Underlying Convergence
Discussion
Conclusions
