Abstract

Support vector machines (SVMs) are rarely used to classify very large datasets because training and testing on such data are computationally expensive. Many researchers have tried to reduce SVM training time with sample reduction methods, often by clustering the training samples. However, such methods are not effective at extracting informative patterns. This paper presents a new supervised classification method, multiseed-based SVM (MSB-SVM), intended specifically for multiclass classification of very large datasets. The main contributions of the paper are (i) an efficient multiseed technique for selecting seed points from circular or elongated class training samples, (ii) selection of adjacent class pairs from the set of multiseeds using the minimum spanning tree, and (iii) extraction of support vectors from the seed-equivalent regions of each class pair, so that multiclass problems are handled without excessive computation. Experimental results on a variety of datasets showed better training and testing times than other sample reduction methods.

The traditional SVM solution has $O(n^{2})$ time complexity, which makes it impractical for very large datasets. The multiseed technique depends on the estimated density of each data point, which is computed in $O(n \log n)$ time. Given the estimated densities, the seed selection algorithm costs $O(n)$. This is the only overhead of sample reduction, and it is smaller than that of clustering-based methods. At the same time, the number of support vectors is sharply reduced, so the decision surface is found in less time. In addition, the classification accuracy of the proposed technique is significantly better than that of existing sample reduction methods, especially for large datasets.
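
The abstract does not spell out the seed selection rule, so the following is only an illustrative sketch of density-driven seed selection, not the authors' algorithm. It assumes a neighbor count within a fixed radius as the density estimate and a greedy pass in descending-density order. The names `estimate_density`, `select_seeds`, and `radius` are hypothetical. The brute-force density estimate here is quadratic; reaching the stated $O(n \log n)$ would require a spatial index, which is omitted for brevity.

```python
import math


def estimate_density(points, radius):
    """Assumed density estimate: number of other points within `radius`.

    Brute force for clarity; this is O(n^2), not the O(n log n) the paper reports.
    """
    return [
        sum(1 for q in points if math.dist(p, q) <= radius) - 1  # exclude p itself
        for p in points
    ]


def select_seeds(points, radius):
    """Greedy seed selection (an assumption, not the paper's exact rule).

    Visit points from densest to sparsest and keep a point as a seed only if
    every seed chosen so far is farther than `radius` away from it.
    """
    density = estimate_density(points, radius)
    order = sorted(range(len(points)), key=lambda i: -density[i])
    seeds = []
    for i in order:
        if all(math.dist(points[i], s) > radius for s in seeds):
            seeds.append(points[i])
    return seeds


# A compact (roughly circular) class and an elongated class.
compact = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
elongated = [(0.5 * k, 0.0) for k in range(21)]  # points on a segment from x=0 to x=10

compact_seeds = select_seeds(compact, radius=1.0)
elongated_seeds = select_seeds(elongated, radius=2.0)
```

Under these assumptions a compact class collapses to a single seed, while an elongated class is covered by several seeds spread along its length, which is the behavior the term "multiseed" suggests.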

Highlights

  • Image classification is extremely helpful for classifying satellite images

  • The MSB-support vector machine (SVM) classification technique reduces the computational cost in two ways: firstly, an efficient training sample selection method is implemented based on the multiseed technique without clustering the data, and secondly, multiclass problems are handled using the minimum spanning tree (MST) to improve the cost efficiency of the proposed classification technique

  • We present a new approach to SVM classification that handles large datasets


Summary

Introduction

Image classification is extremely helpful for classifying satellite images. Various classification procedures are described in the literature, for example nearest neighbor classifiers, artificial neural networks, and support vector machines (SVMs). Several sample reduction techniques reduce the computational burden of SVMs, such as selective sampling, random sampling, and clustering-based SVMs, but they are themselves very complex for large datasets [11]. Selecting important data from a large data bank is a critical problem because of its computational complexity, especially when the data are nonstationary and combine old and new samples. In this context, Lin et al. [19] proposed a representative data detection methodology based on pattern recognition techniques. The MSB-SVM classification technique reduces the computational cost in two ways: first, an efficient training sample selection method is implemented based on the multiseed technique, without clustering the data; second, multiclass problems are handled using the minimum spanning tree (MST) to improve the cost efficiency of the proposed classification technique.
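
To illustrate how an MST can limit the number of class pairs, the sketch below builds a spanning tree over classes with Prim's algorithm. It is a minimal example under stated assumptions, not the paper's procedure: the distance between two classes is assumed to be the smallest distance between any of their seeds, and the names `class_distance` and `adjacent_class_pairs` are hypothetical. For $k$ classes, a spanning tree has $k-1$ edges, compared with the $k(k-1)/2$ pairs of a one-vs-one scheme.

```python
import math


def class_distance(seeds_a, seeds_b):
    """Assumed class-to-class distance: closest pair of seeds."""
    return min(math.dist(p, q) for p in seeds_a for q in seeds_b)


def adjacent_class_pairs(class_seeds):
    """Return the MST edges over classes as (label_a, label_b) pairs.

    class_seeds maps a class label to its list of seed points.
    Uses Prim's algorithm on the complete graph of classes.
    """
    labels = list(class_seeds)
    if not labels:
        return []
    in_tree = {labels[0]}
    edges = []
    while len(in_tree) < len(labels):
        # Cheapest edge leaving the current tree.
        _, a, b = min(
            (
                (class_distance(class_seeds[a], class_seeds[b]), a, b)
                for a in in_tree
                for b in labels
                if b not in in_tree
            ),
            key=lambda edge: edge[0],
        )
        edges.append((a, b))
        in_tree.add(b)
    return edges


# Four hypothetical classes with one seed each.
example_seeds = {
    "A": [(0.0, 0.0)],
    "B": [(1.0, 0.0)],
    "C": [(5.0, 0.0)],
    "D": [(5.0, 1.0)],
}
pairs = adjacent_class_pairs(example_seeds)
```

In this example the tree keeps three class pairs (A and B, B and C, C and D) instead of the six that a one-vs-one scheme would consider, so a binary SVM would only need to be trained for neighboring classes.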

The multiseed technique
The proposed MSB-SVM technique
Experimental results and discussion
Conclusion
