Abstract

This paper considers the development of support vector machine (SVM) classifiers, and ensembles of such classifiers, using a modified particle swarm optimization (PSO) algorithm. Solving this problem makes it possible to perform high-precision data classification, in particular Big Data classification, within acceptable time. The modified PSO algorithm searches simultaneously for the kernel function type, the kernel function parameters, and the value of the regularization parameter of the SVM classifier. The algorithm is built on the idea of particle «regeneration»: during its execution, some particles change their kernel function type to that of the particle with the best classification accuracy. The proposed PSO algorithm reduces the time needed to develop SVM classifiers, which is very important for the Big Data classification problem. In most cases the resulting SVM classifier provides high-quality data classification. In exceptional cases, SVM ensembles can be used; these are based on the decorrelation maximization algorithm for different decision-making strategies on the data classification and on the majority vote rule. A two-level SVM classifier is also proposed: it works as a group of SVM classifiers at the first level and as an SVM classifier based on the modified PSO algorithm at the second level. The results of experimental studies confirm the efficiency of the proposed approaches for Big Data classification.

Highlights

  • Big Data is a term for data sets that are so large and/or complex that traditional data processing technologies are inadequate

  • The proposed approach is based on a modified particle swarm optimization (PSO) algorithm whose main idea is the «regeneration» of particles: some particles change their kernel function type to that of the particle with the best classification accuracy

  • When the support vector machine (SVM) classifier is developed using the PSO algorithm, the swarm particles can be defined by vectors that give their position in the search space and are encoded by the kernel function parameters and the regularization parameter:, where i is the particle number (


Summary

INTRODUCTION

Big Data is a term for data sets that are so large and/or complex that traditional data processing technologies are inadequate. The approach considered here is based on a modified PSO algorithm whose main idea is the «regeneration» of particles: some particles change their kernel function type to that of the particle with the best classification accuracy.
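
The «regeneration» step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Particle` class, the `PARAM_RANGES` table, the `regenerate` function, and the rule for choosing which particles switch (here, the least accurate ones with a different kernel type, up to a given fraction of the swarm) are all assumptions made for illustration. The text above states only that some particles adopt the kernel function type of the particle with the best classification accuracy.

```python
import random
from dataclasses import dataclass

# Hypothetical search ranges for each kernel type (illustrative values only).
PARAM_RANGES = {
    "rbf": {"gamma": (1e-3, 10.0)},
    "poly": {"degree": (2.0, 5.0)},
    "sigmoid": {"gamma": (1e-3, 1.0)},
}


@dataclass
class Particle:
    kernel: str            # kernel function type
    params: dict           # kernel function parameters
    C: float               # regularization parameter
    accuracy: float = 0.0  # classification accuracy from the last evaluation


def random_params(kernel, rng):
    """Draw kernel parameters uniformly from the assumed ranges."""
    return {name: rng.uniform(lo, hi)
            for name, (lo, hi) in PARAM_RANGES[kernel].items()}


def regenerate(swarm, fraction=0.3, rng=None):
    """Switch some particles to the kernel type of the most accurate particle.

    Assumption: the least accurate particles with a different kernel type are
    switched first, and at most `fraction` of the swarm (at least one particle)
    is switched per call. Returns the number of particles switched.
    """
    if rng is None:
        rng = random.Random()
    best = max(swarm, key=lambda p: p.accuracy)
    others = sorted((p for p in swarm if p.kernel != best.kernel),
                    key=lambda p: p.accuracy)
    n_switch = min(len(others), max(1, int(fraction * len(swarm))))
    for p in others[:n_switch]:
        p.kernel = best.kernel
        # The old parameters belong to another kernel, so re-draw them.
        p.params = random_params(best.kernel, rng)
    return n_switch


rng = random.Random(0)
swarm = [
    Particle("rbf", random_params("rbf", rng), C=1.0, accuracy=0.91),
    Particle("poly", random_params("poly", rng), C=10.0, accuracy=0.62),
    Particle("sigmoid", random_params("sigmoid", rng), C=0.5, accuracy=0.70),
    Particle("poly", random_params("poly", rng), C=2.0, accuracy=0.85),
]
switched = regenerate(swarm, fraction=0.5, rng=rng)
kernels = [p.kernel for p in swarm]
```

In this sketch a switched particle keeps its regularization parameter but gets fresh kernel parameters, because the parameters of one kernel type have no meaning for another. Whether the original algorithm re-draws or transforms them is not stated in the text above.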

THE SUPPORT VECTOR MACHINE CLASSIFIER
THE MODIFIED PARTICLE SWARM OPTIMIZATION ALGORITHM
THE SUPPORT VECTOR MACHINE ENSEMBLE
TWO-LEVEL SVM CLASSIFIER
EXPERIMENTAL STUDIES
Result
CONCLUSION
