Abstract

Big data currently challenges the extraction of relevant knowledge with classical machine learning techniques, because such processing requires both efficient data reduction and new, fast, distributed learning algorithms. The widespread use of the support vector machine (SVM) calls for efficient ways of constructing a classifier that is both suitable for big data and highly accurate. In practice, the efficiency of an SVM depends on efficiently deriving the optimal feature subset and the algorithmic parameters. Grid search usually finds the global optimum and achieves higher learning accuracy than particle swarm optimization (PSO) and genetic algorithms (GA), but its heavier computation is time-consuming. Grid search is nonetheless attractive for parallelization: the SVMs trained for different parameter settings do not depend on one another, so they can all be trained simultaneously. This paper proposes a novel parallel implementation of grid optimization using Spark Radoop that reduces this heavy computational load and makes the method suitable for big data processing. The major contributions of this study are a significant reduction in computation time compared with the serial version of gridSVM and a high classification accuracy compared with other parallel optimization techniques.
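The independence property that makes grid search parallelizable can be illustrated with a small single-machine sketch. This is not the paper's Spark Radoop implementation: the toy data, the Pegasos-style linear SVM trainer, the grid over `lam` and `epochs`, and all function names are assumptions made for illustration, and a thread pool stands in for the cluster workers.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical toy data: 2-D points with labels +1 / -1.
TRAIN = [((2.0, 2.5), 1), ((1.5, 3.0), 1), ((3.0, 2.0), 1), ((2.5, 3.5), 1),
         ((-2.0, -1.5), -1), ((-1.0, -3.0), -1), ((-3.0, -2.0), -1), ((-2.5, -0.5), -1)]
VALID = [((2.0, 1.0), 1), ((1.0, 2.0), 1), ((-1.0, -2.0), -1), ((-2.0, -1.0), -1)]


def train_linear_svm(data, lam, epochs):
    """Pegasos-style subgradient descent on the regularized hinge loss.

    Samples are visited in a fixed order so that training is deterministic.
    """
    w = [0.0, 0.0]
    t = 0
    for _ in range(epochs):
        for x, y in data:
            t += 1
            eta = 1.0 / (lam * t)                     # decaying step size
            margin = y * (w[0] * x[0] + w[1] * x[1])  # uses the weights before the update
            w = [(1.0 - 1.0 / t) * wi for wi in w]    # regularization shrink (eta * lam == 1/t)
            if margin < 1.0:                          # hinge loss is active for this sample
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def accuracy(w, data):
    """Fraction of samples whose predicted sign matches the label."""
    correct = sum(1 for x, y in data if y * (w[0] * x[0] + w[1] * x[1]) > 0)
    return correct / len(data)


def evaluate(params):
    """Train and score one SVM; depends only on its own parameter pair."""
    lam, epochs = params
    w = train_linear_svm(TRAIN, lam, epochs)
    return params, accuracy(w, VALID)


def parallel_grid_search(grid, workers=4):
    """Evaluate every grid point concurrently and return (best, all_results)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(evaluate, grid))      # map preserves grid order
    best = max(results, key=lambda r: r[1])
    return best, results


# Grid over the regularization strength and the number of training epochs.
GRID = list(product([0.01, 0.1, 1.0], [1, 5, 20]))
best, results = parallel_grid_search(GRID)
print("best parameters:", best[0], "validation accuracy:", best[1])
```

Because `evaluate` reads only its own parameter pair and shared read-only data, the grid points can be scored in any order or on any worker without changing the result. In CPython a thread pool gives little real speed-up for CPU-bound training, so the sketch shows the task structure only; a distributed engine would assign each grid point to a separate executor.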
