Abstract

In this paper, the Bayesian Data Reduction Algorithm (BDRA) is extended to model uncertainty in feature information. The new method spreads each data observation over multiple discretized bins, in a way analogous to convolving it with a blur function prior to quantization. The motivation is to enforce a notion of closeness between discretized bins that are actually close prior to quantization. The original model incorporates no such notion: the number of observed training data in one bin bears no stronger relationship to a nearby bin than to a distant one, where nearness is measured in the underlying unquantized data. As a result, the performance of the BDRA can be improved in difficult classification situations involving very small numbers of training data. The BDRA is based on the assumption that the discrete symbol probabilities of each class are a priori uniformly Dirichlet distributed, and it employs a greedy approach (similar to a backward sequential feature search) for removing irrelevant features from the training data of each class. Note that removing irrelevant features is synonymous here with selecting those features that provide the best classification performance; the metric for making data-reduction decisions is an analytic formula for the probability of error conditioned on the training data. To illustrate the performance of the extended algorithm, results are shown using both real and simulated data.
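The abstract does not specify the blur kernel or the exact update rule, but the core idea, spreading each training observation over nearby discretized bins before forming Dirichlet-based symbol probabilities, can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: the Gaussian kernel, the sigma parameter, and the function names are assumptions made for the example. The posterior-mean formula (n_i + 1)/(N + M) is the standard result for a uniform Dirichlet prior over M symbols, which the abstract does assume.

```python
import numpy as np

def soft_bin_counts(observations, bin_edges, sigma=0.5):
    """Accumulate 'blurred' counts: each observation contributes
    Gaussian-weighted mass to every bin, instead of a hard count of 1
    in a single bin. sigma controls the blur width (illustrative choice;
    the paper's actual blur function is not given in the abstract)."""
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    counts = np.zeros(len(centers))
    for x in observations:
        w = np.exp(-0.5 * ((centers - x) / sigma) ** 2)
        counts += w / w.sum()  # each observation still adds total mass 1
    return counts

def dirichlet_symbol_probs(counts):
    """Posterior-mean symbol probabilities under a uniform Dirichlet
    prior: p_i = (n_i + 1) / (N + M), where M is the number of bins."""
    return (counts + 1.0) / (counts.sum() + len(counts))

# Tiny demonstration with simulated one-dimensional training data.
rng = np.random.default_rng(0)
train = rng.normal(loc=2.0, scale=1.0, size=5)  # very few training data
edges = np.linspace(-2.0, 6.0, 9)               # 8 discretized bins
blurred = soft_bin_counts(train, edges, sigma=0.5)
print(dirichlet_symbol_probs(blurred))
```

With very few training samples, the blurred counts keep bins adjacent to observed data from receiving zero mass, which is precisely the closeness notion the extension aims to introduce into the quantized model.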
