Abstract
The Extreme Learning Machine (ELM) algorithm has attracted considerable attention from researchers and has been widely applied in recent years, especially to big data, because of its strong generalization performance and fast learning speed. Semi-supervised ELM (SS-ELM) extends ELM to semi-supervised learning, an important problem in machine learning on big data. However, the original SS-ELM algorithm must hold the entire data set in memory before processing it, so it cannot handle the large and web-scale data sets that are common in the big data era. To solve this problem, this paper first proposes an efficient parallel SS-ELM (PSS-ELM) algorithm on the MapReduce model, adopting a series of optimizations to improve its performance. It then proposes a parallel approximate SS-ELM algorithm based on MapReduce (PASS-ELM). PASS-ELM builds on the approximate adjacent similarity matrix (AASM) algorithm, which uses the Locality-Sensitive Hashing (LSH) scheme to compute an approximate adjacent similarity matrix, greatly reducing computational complexity and memory usage. The proposed AASM algorithm is general, because computing the adjacent similarity matrix is a key operation in many other machine learning algorithms. Experimental results demonstrate that the proposed PASS-ELM algorithm can efficiently process very large-scale data sets without significantly affecting the accuracy of the results.
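The idea of using LSH to approximate an adjacent similarity matrix can be illustrated with a minimal single-machine sketch. This is not the paper's AASM algorithm: the abstract does not specify the hash family, the similarity kernel, or how the work is split across MapReduce tasks. The sketch assumes random-hyperplane (sign) hashing and a Gaussian kernel, and the names `lsh_signature`, `approximate_similarity`, `num_planes`, and `sigma` are hypothetical. Points are grouped into buckets by hash signature, and similarities are computed only within each bucket, so pairs that fall into different buckets are treated as having zero similarity.

```python
import math
import random
from collections import defaultdict


def lsh_signature(point, hyperplanes):
    """Return one sign bit per random hyperplane (assumed hash family)."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, point)) >= 0 else 0
        for plane in hyperplanes
    )


def approximate_similarity(points, num_planes=4, sigma=1.0, seed=0):
    """Sparse approximate similarity matrix as {(i, j): weight} with i < j.

    Only pairs that share an LSH bucket are compared, so the all-pairs
    scan is replaced by smaller per-bucket scans.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Hypothetical choice: Gaussian random hyperplanes through the origin.
    hyperplanes = [
        [rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_planes)
    ]

    # Group point indices by signature; in a MapReduce setting the
    # signature would plausibly serve as the shuffle key (assumption).
    buckets = defaultdict(list)
    for idx, point in enumerate(points):
        buckets[lsh_signature(point, hyperplanes)].append(idx)

    similarity = {}
    for members in buckets.values():
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                i, j = members[a], members[b]
                sq_dist = sum(
                    (x - y) ** 2 for x, y in zip(points[i], points[j])
                )
                # Gaussian kernel weight (assumed similarity measure).
                similarity[(i, j)] = math.exp(-sq_dist / (2 * sigma ** 2))
    return similarity
```

With sign hashing, points that are positive multiples of each other always share a bucket, while a point and its negation do not, so calling `approximate_similarity([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]])` stores weights only for the pairs `(0, 1)` and `(2, 3)`. Increasing `num_planes` makes buckets smaller, which saves work and memory but drops more true neighbors.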