Abstract

A stacked ensemble, which uses a meta-learner to combine (stack) the predictions of multiple base classifiers, tends to perform suboptimally on imbalanced classification. To improve the classification performance of stacked ensembles on imbalanced datasets, we propose a method named Neighborhood Undersampling Stacked Ensemble (NUS-SE). NUS-SE consists of two proposed components: an undersampling-based stacked ensemble framework (US-SE) and an undersampling technique. In the metadata generation step of a stacked ensemble, a cross-validation-like procedure (CV-prediction) is commonly used. However, when undersampling is performed within a stacked ensemble that uses CV-prediction for metadata generation, the resulting metadata is incomplete, with missing prediction values. In the US-SE component, we therefore replace the standard CV-prediction procedure with our proposed Subset and Out-of-Subset (S-OOS) prediction procedure. The S-OOS prediction procedure generates metadata without missing prediction values, which enables undersampling to be integrated within the stacked ensemble, so that multiple undersampled data subsets are used to train US-SE's base learners. For the undersampling component, we further propose a novel undersampling technique, Neighborhood Undersampling (NUS), which selects majority instances based on their local neighborhood information. NUS-SE is evaluated against non-resampling-based stacked ensembles as baseline methods. The experiment demonstrates that NUS-SE, an undersampling-based stacked ensemble, achieves better performance than the non-resampling-based stacked ensembles.
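The missing-metadata problem described above can be illustrated with a minimal sketch. The abstract does not specify how S-OOS works internally, so the `s_oos_style_metadata` function below is only one assumed reading of the name: instances inside the undersampled subset receive held-out CV predictions, and instances outside it are predicted by a model trained on the whole subset. The toy data, the threshold base learner, random undersampling (used here in place of NUS), and all function names are hypothetical and not taken from the paper.

```python
import random


def make_data(n_majority=90, n_minority=10, seed=0):
    """Toy 1-D imbalanced dataset (hypothetical; not from the paper)."""
    rng = random.Random(seed)
    X = [rng.gauss(0.0, 1.0) for _ in range(n_majority)]
    X += [rng.gauss(2.0, 1.0) for _ in range(n_minority)]
    y = [0] * n_majority + [1] * n_minority
    return X, y


def random_undersample(y, rng):
    """Keep every minority index plus an equal-sized random majority sample."""
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    return sorted(minority + rng.sample(majority, len(minority)))


def fit_threshold(X, y, idx):
    """Toy base learner: threshold halfway between the two class means."""
    neg = [X[i] for i in idx if y[i] == 0]
    pos = [X[i] for i in idx if y[i] == 1]
    return (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2.0


def predict(threshold, x):
    return 1 if x > threshold else 0


def cv_metadata_on_subset(X, y, subset, n_folds=5):
    """CV-prediction restricted to the undersampled subset.

    Instances dropped by undersampling never land in a held-out fold,
    so their metadata entries stay None (the missing values).
    """
    meta = [None] * len(X)
    for k in range(n_folds):
        held_out = set(subset[k::n_folds])
        train = [i for i in subset if i not in held_out]
        threshold = fit_threshold(X, y, train)
        for i in held_out:
            meta[i] = predict(threshold, X[i])
    return meta


def s_oos_style_metadata(X, y, subset, n_folds=5):
    """ASSUMED reading of S-OOS, not the authors' confirmed procedure.

    In-subset rows keep their held-out CV predictions; out-of-subset rows
    are predicted by a model trained on the whole subset.
    """
    meta = cv_metadata_on_subset(X, y, subset, n_folds)
    threshold = fit_threshold(X, y, subset)
    in_subset = set(subset)
    for i in range(len(X)):
        if i not in in_subset:
            meta[i] = predict(threshold, X[i])
    return meta


X, y = make_data()
subset = random_undersample(y, random.Random(1))
cv_meta = cv_metadata_on_subset(X, y, subset)
filled_meta = s_oos_style_metadata(X, y, subset)
print("missing after CV-prediction on subset:", cv_meta.count(None))
print("missing after S-OOS-style procedure:", filled_meta.count(None))
```

In this sketch, each undersampled subset would contribute one metadata column per base learner; repeating the procedure over several subsets would yield the multi-column metadata that a meta-learner consumes.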
