Abstract

Ensemble-based methods are popular approaches that increase the accuracy of a decision by aggregating the opinions of individual voters. The common aim is to maximize accuracy; however, a natural limitation arises when incremental costs are also assigned to the individual voters. Consequently, we investigate creating ensembles under an additional constraint on the total cost of the members. This task can be formulated as a knapsack problem, where the energy is the ensemble accuracy formed by some aggregation rule. However, the generally applied aggregation rules lead to a nonseparable energy function, which rules out the common solution tools, such as dynamic programming. We introduce a novel stochastic approach that considers the energy as the joint probability function of the member accuracies. This type of knowledge can be efficiently incorporated into a stochastic search process as a stopping rule, since we have information on the expected accuracy or, alternatively, on the probability of finding more accurate ensembles. Experimental analyses of the created ensembles of pattern classifiers and object detectors confirm the efficiency of our approach over other pruning methods. Moreover, we propose a novel stochastic search method that better fits the energy and can be incorporated into other stochastic strategies as well.
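To make the knapsack formulation concrete, the following hypothetical sketch selects a majority-voting ensemble under a total cost budget by exhaustive search. All names (`majority_vote_accuracy`, `best_ensemble_exhaustive`) are illustrative and not from the paper; the point is that the energy (ensemble accuracy under majority voting) is not a sum of per-member terms, so the classic dynamic-programming recursion for knapsack does not apply.

```python
import itertools

def majority_vote_accuracy(member_outputs, labels):
    """Fraction of samples on which the majority of members is correct."""
    n_correct = 0
    for i, y in enumerate(labels):
        votes = sum(1 for out in member_outputs if out[i] == y)
        if 2 * votes > len(member_outputs):
            n_correct += 1
    return n_correct / len(labels)

def best_ensemble_exhaustive(outputs, labels, costs, budget):
    """Exhaustive search over member subsets within the cost budget.
    Feasible only for small pools; shown here to illustrate the
    nonseparable energy, not as a practical solver."""
    best_acc, best_subset = 0.0, ()
    for r in range(1, len(outputs) + 1):
        for subset in itertools.combinations(range(len(outputs)), r):
            if sum(costs[j] for j in subset) <= budget:
                acc = majority_vote_accuracy(
                    [outputs[j] for j in subset], labels)
                if acc > best_acc:
                    best_acc, best_subset = acc, subset
    return best_subset, best_acc
```

Note that three members that are each 75% accurate individually can reach 100% jointly when their errors fall on different samples, which is exactly why the energy cannot be decomposed member by member.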

Highlights

  • Ensemble-based systems are rather popular in several application fields and are employed to increase the decision accuracy of individual approaches

  • We show that this type of knowledge can be efficiently incorporated into any stochastic search process as a stopping rule, since we have information on the expected accuracy or, alternatively, on the probability of finding more accurate ensembles

  • We estimate the distribution of q in terms of its mean and variance. This information can be efficiently incorporated as a stopping rule in stochastic search algorithms, as we demonstrate, e.g., for simulated annealing (SA)
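The highlights above describe estimating the mean and variance of the ensemble accuracy q and using them as a stopping rule in a stochastic search. The sketch below is a minimal illustration of that idea grafted onto a generic simulated-annealing loop; it assumes a normal approximation for the distribution of q and threshold names (`p_stop`, `n_probe`) of our own choosing, so it should be read as one plausible instantiation, not the paper's exact estimator.

```python
import math
import random

def normal_tail(x, mu, sigma):
    """P(Q > x) under a normal approximation N(mu, sigma^2)."""
    if sigma == 0.0:
        return 0.0
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2)))

def sa_with_stopping_rule(energy, random_solution, neighbor,
                          n_probe=200, t0=1.0, cooling=0.95,
                          p_stop=1e-3, max_iter=10000, seed=0):
    rng = random.Random(seed)
    # Probe the search space to estimate the mean and variance of the
    # energy (ensemble accuracy q) over random feasible ensembles.
    probes = [energy(random_solution(rng)) for _ in range(n_probe)]
    mu = sum(probes) / n_probe
    var = sum((p - mu) ** 2 for p in probes) / (n_probe - 1)
    sigma = math.sqrt(var)

    cur = random_solution(rng)
    cur_e = energy(cur)
    best, best_e = cur, cur_e
    t = t0
    for _ in range(max_iter):
        # Stopping rule: halt once the estimated probability of drawing
        # an ensemble more accurate than the best found so far falls
        # below the threshold p_stop.
        if normal_tail(best_e, mu, sigma) < p_stop:
            break
        cand = neighbor(cur, rng)
        cand_e = energy(cand)
        # Maximizing accuracy: always accept improvements; accept worse
        # moves with the usual Boltzmann probability.
        if cand_e >= cur_e or rng.random() < math.exp((cand_e - cur_e) / t):
            cur, cur_e = cand, cand_e
            if cur_e > best_e:
                best, best_e = cur, cur_e
        t = max(t * cooling, 1e-12)  # avoid division by zero as t -> 0
    return best, best_e
```

The stopping rule is search-strategy agnostic: the same tail-probability check can wrap any iterative optimizer that tracks a best-so-far energy.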


Summary

Introduction

Ensemble-based systems are popular in several application fields and are employed to increase the decision accuracy of individual approaches. There are also efforts to extend the basic ensemble pruning models to account for resource constraints such as training/test execution time or memory/storage space (Buciluǎ et al. 2006; Hinton et al. 2015). A popular approach to this end is to apply multi-objective evolutionary algorithms, like NSGA-II (Deb et al. 2002). Besides its individual accuracy and cost, we compute a usefulness value for each candidate member during the selection process, reflecting its direct contribution to the objective function, which in our case is based on the majority voting rule.
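A usefulness value of the kind described above can be sketched as the marginal gain in majority-voting accuracy a candidate brings to the current ensemble, normalized by its cost. The function name and the per-unit-cost normalization below are our illustrative assumptions, not the paper's exact definition.

```python
def usefulness(candidate_out, ensemble_outs, labels, cost):
    """Hypothetical usefulness score: change in majority-voting
    accuracy when the candidate joins the current ensemble,
    normalized by the candidate's cost."""
    def mv_acc(outs):
        correct = 0
        for i, y in enumerate(labels):
            votes = sum(1 for o in outs if o[i] == y)
            if 2 * votes > len(outs):
                correct += 1
        return correct / len(labels)

    gain = mv_acc(ensemble_outs + [candidate_out]) - mv_acc(ensemble_outs)
    return gain / cost
```

A cheap member that breaks ties in the current ensemble's favor can thus score higher than a more accurate but expensive one, which is the behavior a cost-aware selection step wants to reward.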

Basic concepts and notation
Deterministic selection strategies
Stochastic search algorithms
Stochastic estimation of ensemble energy
Estimation of the distribution of member accuracies
Stopping rule for ensemble selection
Empirical analysis
Kaggle challenges
Binary classification problems
Discussion
Findings