Abstract

In many ensemble classification paradigms, the function that combines local/base classifier decisions is learned in a supervised fashion. Such methods require common labeled training examples across the classifier ensemble. However, in some scenarios where an ensemble solution is needed, common labeled data may not exist; examples are (i) legacy/proprietary classifiers and (ii) spatially distributed and/or multiple-modality sensors. In such cases, it is standard to apply fixed (untrained) decision aggregation such as voting, averaging, or naive Bayes rules. In recent work, an alternative transductive learning strategy was proposed. There, decisions on test samples were chosen to satisfy constraints measured by each local classifier. This approach was shown to reliably correct for class prior mismatch and to robustly account for classifier dependencies, and significant gains in accuracy over fixed aggregation rules were demonstrated. That work has two main limitations. First, feasibility of the constraints was not guaranteed. Second, heuristic learning was applied. Here, we overcome these problems via a transductive extension of maximum entropy/improved iterative scaling for aggregation in distributed classification. This method is shown to achieve improved decision accuracy over both the earlier transductive approach and fixed rules on a number of UC Irvine datasets.
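For readers unfamiliar with the fixed (untrained) aggregation rules named in the abstract, the following is a minimal Python sketch of voting, averaging, and a naive Bayes product rule for a single test sample. It is an illustration only, not the paper's implementation: the function names, the `EPS` guard, and the `(M, C)` array layout of local posteriors are assumptions made here. The naive Bayes rule assumes the local feature vectors are conditionally independent given the class, in which case the fused posterior is proportional to the product of the local posteriors divided by the class prior raised to the power M - 1.

```python
import numpy as np

EPS = 1e-12  # guards against log(0); a choice made for this sketch


def majority_vote(posteriors):
    """posteriors: array of shape (M classifiers, C classes) for one sample."""
    votes = np.argmax(posteriors, axis=1)            # hard decision per classifier
    counts = np.bincount(votes, minlength=posteriors.shape[1])
    return int(np.argmax(counts))                    # ties go to the lowest class index


def average_rule(posteriors):
    """Average the local posteriors, then pick the most probable class."""
    return int(np.argmax(posteriors.mean(axis=0)))


def naive_bayes_rule(posteriors, prior):
    """Product rule under conditional independence of local features.

    Fused posterior is proportional to prod_j P_j(c | x_j) / P(c)^(M - 1).
    `prior` is the assumed common class prior, shape (C,).
    """
    m = posteriors.shape[0]
    log_score = (np.log(np.clip(posteriors, EPS, 1.0)).sum(axis=0)
                 - (m - 1) * np.log(np.clip(prior, EPS, 1.0)))
    return int(np.argmax(log_score))


# Three local classifiers, two classes: two weak votes for class 0,
# one confident vote for class 1.
P = np.array([[0.60, 0.40],
              [0.60, 0.40],
              [0.05, 0.95]])
print(majority_vote(P), average_rule(P), naive_bayes_rule(P, np.array([0.5, 0.5])))
```

In this toy case the rules disagree: voting follows the two weak classifiers, while averaging and the product rule follow the single confident one. None of the three adapts to the test data, which is the gap the transductive approach targets.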

Highlights

  • There has been a great deal of research on techniques for building ensemble classification systems (e.g., [1,2,3,4,5,6,7,8,9,10]).

  • Even in the feasible case, there is no unique feasible solution; feasible solutions found by [15] are not guaranteed to possess any special properties or good test set accuracy. We address these problems by proposing a transductive extension of maximum entropy/improved iterative scaling (ME/IIS) [34,35,36] for aggregation in distributed classification.

  • We present both illustrative results and experiments comparing the transductive iterative scaling (TIS) algorithm (Section 4.5, with the choice β = 0) and the extended TIS (ETIS) algorithm with a variety of alternative transductive and fixed combining schemes.

Summary

INTRODUCTION

There has been a great deal of research on techniques for building ensemble classification systems (e.g., [1,2,3,4,5,6,7,8,9,10]). In [14], fixed combining was derived under the assumption that feature vectors of the local classifiers are jointly Gaussian, with known correlation structure over the joint feature space (i.e., across the local classifiers). Neither these methods nor other past methods for distributed classification have considered learning the aggregation function. The clinic would like to leverage the biomarkers (and associated classifiers) from each of the studies in making decisions for its patients. This again amounts to distributed classification without common labeled training examples. We address these problems by proposing a transductive extension of maximum entropy/improved iterative scaling (ME/IIS) [34,35,36] for aggregation in distributed classification. This approach ensures both feasibility of the constraints and uniqueness of the solution. The paper concludes with a discussion and pointers to future work.

DISTRIBUTED CLASSIFICATION PROBLEM
Transductive maximum likelihood methods
Transductive constraint-based learning
Choice of constraints
CB learning approach
TRANSDUCTIVE CB BY MAXIMUM ENTROPY
Augmentation with local classifier supports
Augmentation with support derived from constraints
Full support in the hard decision case
Constraint relaxation
Comments on ETIS algorithm
An infeasible constraint example
Real data experimental results
Influence of β
CONCLUSIONS
Proof of feasibility of Problem 2
Proof that TIS updates descend in the Lagrangian
