Abstract

In this paper, we present a cascading incremental training approach based on the Support Vector Machine (SVM) for large-scale, unevenly distributed data. Rather than using the training data directly, we take the distribution and scale of the data into account: the Synthetic Minority Oversampling Technique (SMOTE) and Generative Adversarial Networks (GANs) are used together to synthesize new samples and add them to the data set so that the class sizes are balanced. We then propose a cascading model that divides the samples into groups and applies a multi-layer convolutional neural network for pre-training to collect the embedding vectors of the training samples. Finally, the features extracted during pre-training are used for incremental training of the SVM. A multi-layer convolutional network and clustering are applied to retain key information and to train features for classification, and an uncertainty strategy is used to obtain potentially informative features. In comparative experiments, our proposed method outperforms state-of-the-art competitors.
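
The abstract describes a three-stage pipeline: class balancing, CNN-based embedding extraction, and incremental SVM training. The following is a minimal illustrative sketch of that pipeline, not the authors' implementation; the toy data, the layer sizes, and the use of SGDClassifier with hinge loss as an incremental linear SVM are all assumptions made for the example (the GAN-based oversampling and the cascading/uncertainty components are omitted).

```python
# Sketch of the pipeline: (1) balance classes with SMOTE, (2) extract embeddings
# with a small CNN, (3) train an SVM-style classifier incrementally on the
# extracted features. All names and sizes here are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from imblearn.over_sampling import SMOTE
from sklearn.linear_model import SGDClassifier

# Toy imbalanced data: 1000 samples, 64 features, roughly a 9:1 class ratio.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 64)).astype(np.float32)
y = (rng.random(1000) < 0.1).astype(int)

# Step 1: oversample the minority class to obtain a balanced training set.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)
X_bal = X_bal.astype(np.float32)

# Step 2: a small 1-D CNN used here only as a feature extractor
# (pre-training is omitted for brevity; in practice the network
# would first be trained on the grouped samples).
cnn = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(16),
    nn.Flatten(),  # -> 8 * 16 = 128-dimensional embedding
)
with torch.no_grad():
    emb = cnn(torch.from_numpy(X_bal).unsqueeze(1)).numpy()

# Step 3: incremental training of a linear SVM (hinge loss) in mini-batches,
# standing in for the incremental SVM training described in the abstract.
clf = SGDClassifier(loss="hinge")
classes = np.unique(y_bal)
for start in range(0, len(emb), 256):
    batch = slice(start, start + 256)
    clf.partial_fit(emb[batch], y_bal[batch], classes=classes)

print("train accuracy:", clf.score(emb, y_bal))
```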
