Data dependency in multiple classifier systems

Rozita A Dara, Mohamed S Kamel, Nayer Wanas

doi:10.1016/j.patcog.2008.11.035

Abstract

This paper investigates the data dependency of aggregation modules in multiple classifier systems. We first propose a new categorization scheme in which combining methods are grouped into data-independent, implicitly data-dependent, and explicitly data-dependent approaches. It is argued that data-dependent approaches offer the highest potential for improved performance. We examine this argument comprehensively and explore the impact of data dependency on the performance of multiple classifier systems, evaluating it on two criteria: prediction accuracy and stability. We also examine the effect of class imbalance and uneven data distribution on these two criteria. The findings of an extensive set of comparative experiments indicate that data-dependent aggregation methods are generally more stable and less sensitive to class imbalance. In addition, data-dependent methods exhibited superior or identical generalization ability on most of the data sets.
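To make the three categories concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes one common reading of the scheme: a data-independent combiner is a fixed rule with no trained parameters, an implicitly data-dependent combiner has parameters learned once from training or validation data, and an explicitly data-dependent combiner recomputes its parameters from each input pattern. The function names, the accuracy-based weighting, and the `gate` function are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence

Probs = List[float]       # class-probability vector from one base classifier
Ensemble = List[Probs]    # one probability vector per base classifier


def average_rule(outputs: Ensemble) -> Probs:
    """Data-independent (assumed reading): a fixed rule, nothing is trained."""
    n = len(outputs)
    return [sum(col) / n for col in zip(*outputs)]


def fit_accuracy_weights(val_outputs: List[Ensemble],
                         val_labels: List[int]) -> List[float]:
    """Implicitly data-dependent (assumed reading): weights are learned once
    from validation data and then stay fixed for every test pattern.
    Here each weight is the classifier's share of correct predictions."""
    n_clf = len(val_outputs[0])
    correct = [0] * n_clf
    for outputs, label in zip(val_outputs, val_labels):
        for k, probs in enumerate(outputs):
            if probs.index(max(probs)) == label:
                correct[k] += 1
    total = sum(correct)
    if total == 0:
        # No classifier was ever right: fall back to uniform weights.
        return [1.0 / n_clf] * n_clf
    return [c / total for c in correct]


def weighted_average(outputs: Ensemble, weights: Sequence[float]) -> Probs:
    """Combine classifier outputs with one weight per classifier."""
    return [sum(w * p for w, p in zip(weights, col)) for col in zip(*outputs)]


def explicit_combiner(outputs: Ensemble,
                      x: Sequence[float],
                      gate: Callable[[Sequence[float]], Sequence[float]]) -> Probs:
    """Explicitly data-dependent (assumed reading): the weights are a
    function of the input pattern x, supplied by a hypothetical gate."""
    return weighted_average(outputs, gate(x))
```

In this sketch the only difference between the three combiners is where the weights come from: nowhere (a fixed rule), the validation set (learned once), or the current input (recomputed per pattern). Under these assumptions, that is the dimension along which the abstract's comparison of accuracy and stability would be made.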

Full Text