Abstract

This paper presents a Support Vector Machine (SVM) classification method for large-scale data, based on hierarchical clustering and re-sampling, that combines supervised learning with unsupervised learning. The proposed method first uses k-means clustering to partition the dataset into several subsets. Then, within each subset, it clusters the samples class by class and selects the samples in the neighborhood of each cluster center to form the candidate training dataset. Finally, an SVM is trained on the candidate training dataset. The experimental results show that the proposed method can substantially reduce the SVM learning cost. The proposed method also achieves better classification accuracy than random re-sampling and attains approximately the same classification accuracy as training without sampling.
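The sample-selection steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain Lloyd's k-means at both levels, Euclidean distance, and a fixed number of nearest samples per cluster center as the "neighborhood". The function names (`kmeans`, `select_candidates`) and the parameters (`n_subsets`, `n_clusters`, `n_near`) are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed variant); returns (centers, labels)."""
    rng = random.Random(seed)
    k = min(k, len(points))  # cannot have more clusters than points
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign(points, centers)


def select_candidates(X, y, n_subsets=2, n_clusters=2, n_near=3, seed=0):
    """Return sorted indices of the samples kept as candidate training data."""
    keep = set()
    # Step 1: partition the whole dataset into subsets with k-means.
    _, subset_of = kmeans(X, n_subsets, seed=seed)
    groups = defaultdict(list)
    for i, s in enumerate(subset_of):
        groups[(s, y[i])].append(i)
    # Step 2: within each subset, cluster each class separately.
    for idx in groups.values():
        centers, _ = kmeans([X[i] for i in idx], n_clusters, seed=seed)
        # Step 3: keep the n_near samples closest to each cluster center
        # (an assumed definition of the "neighborhood").
        for c in centers:
            keep.update(sorted(idx, key=lambda i: sq_dist(X[i], c))[:n_near])
    return sorted(keep)
```

The returned indices identify the reduced training set. The final step would fit an SVM on `[X[i] for i in idx]` and `[y[i] for i in idx]` with any SVM library, for example scikit-learn's `SVC`. How large a neighborhood to keep is a trade-off between training cost and accuracy that this sketch does not attempt to tune.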
