Abstract
Faced with massive amounts of image data, classification algorithms running on traditional single-node platforms suffer a dramatic drop in performance. We propose a classification method for large-scale scene images that is based on hybrid optimization and classifier combination in a cluster environment. Support vector machine (SVM) algorithms are optimized by the artificial bee colony and particle swarm optimization algorithms to produce weak classifiers; a strong classifier is then constructed by combining the outputs of 15 weak classifiers using the AdaBoost algorithm. The MapReduce parallel programming model on the Hadoop platform is used to parallelize the algorithm, yielding the proposed parallel AdaBoost hybrid optimization (PAH)-SVM algorithm. Finally, a model is constructed for automatic classification of large-scale scene images. Multiple sets of comparative experiments show that the average classification accuracy of the proposed algorithm on the scene understanding databases (Caltech-256 and Pascal VOC 2012) exceeds 85.0%, and that its training time is less than 10 min when 170,000 images are used. When hardware cost is taken into account, the execution time and accuracy of this algorithm are superior to those of mainstream classification algorithms such as P-SVM and CNN. In addition, the speed of the system based on the proposed algorithm increases linearly, and the constructed Hadoop cluster shows good extensibility. The proposed algorithm is therefore suitable for automatic classification and prediction on large-scale scene images.
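The combination step described above, in which AdaBoost merges weak classifiers into a strong one, can be illustrated with a minimal sketch. It is not the PAH-SVM implementation: it assumes binary labels in {-1, +1}, runs on a single machine without MapReduce, and uses three hypothetical threshold rules (`h1`, `h2`, `h3`) in place of the ABC- and PSO-optimized SVMs. The function names `adaboost_combine` and `predict` and the toy data are likewise illustrative assumptions.

```python
import math


def adaboost_combine(weak_classifiers, X, y, rounds):
    """Discrete AdaBoost over a fixed pool of weak classifiers.

    Assumes binary labels in {-1, +1}; each weak classifier is a
    callable that maps one sample to -1 or +1.
    """
    n = len(X)
    weights = [1.0 / n] * n          # start with uniform sample weights
    ensemble = []                    # list of (alpha, classifier) pairs
    for _ in range(rounds):
        # Select the pool member with the lowest weighted error.
        best, best_err = None, float("inf")
        for clf in weak_classifiers:
            err = sum(w for w, xi, yi in zip(weights, X, y) if clf(xi) != yi)
            if err < best_err:
                best, best_err = clf, err
        if best_err >= 0.5:          # nothing better than chance remains
            break
        eps = max(best_err, 1e-10)   # guard against log of zero
        alpha = 0.5 * math.log((1.0 - eps) / eps)
        ensemble.append((alpha, best))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * yi * best(xi))
                   for w, xi, yi in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote."""
    score = sum(alpha * clf(x) for alpha, clf in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical stand-ins for the optimized SVM weak classifiers.
def h1(x):
    return 1 if x < 2.5 else -1


def h2(x):
    return 1 if x < 8.5 else -1


def h3(x):
    return 1 if x > 5.5 else -1


# Toy one-dimensional data set (illustrative only, not from the paper).
X = list(range(10))
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
pool = [h1, h2, h3]

ensemble = adaboost_combine(pool, X, y, rounds=3)
predictions = [predict(ensemble, x) for x in X]
```

On this toy data each threshold rule misclassifies at least three samples on its own, whereas the weighted vote of the three reproduces every label, which is the effect the combination step relies on.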