Abstract

Scalable frameworks for big data analysis are increasingly important on the modern web, which hosts a huge number of resources, including electronic text documents. Selecting a subset of features that represents a document completely while discarding the irrelevant ones is therefore essential. To this end, this paper proposes a combined PageRank and content-based feature selection (CPRCFS) technique, applied to all the terms present in a given corpus, and uses it to study the suitability and importance of a deep learning classifier called Multilayer ELM (ML-ELM). The top \(k\%\) of terms are selected to generate a reduced feature vector, which is then used to train different classifiers, including ML-ELM. Experimental results show that the proposed feature selection technique is better than or comparable to the baseline techniques, and that Multilayer ELM can outperform state-of-the-art machine learning and deep learning classifiers.
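The general idea of ranking terms by a combined graph-based and content-based score and keeping the top \(k\%\) can be illustrated with a minimal sketch. This is not the paper's CPRCFS procedure, whose details are not given in the abstract. The co-occurrence graph, the TF-IDF-style content score, the max-normalisation, the mixing weight `alpha`, and all function names below are assumptions chosen only for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations


def pagerank_scores(docs, damping=0.85, iterations=50):
    """PageRank over an undirected term co-occurrence graph.

    Assumption: two terms are linked if they appear in the same document.
    """
    neighbours = defaultdict(set)
    for doc in docs:
        terms = sorted(set(doc))
        for t in terms:
            neighbours[t]  # make sure isolated terms still get a node
        for a, b in combinations(terms, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    n = len(neighbours)
    rank = {t: 1.0 / n for t in neighbours}
    for _ in range(iterations):
        # Rank held by terms with no links is spread uniformly.
        dangling = sum(rank[t] for t in neighbours if not neighbours[t])
        new_rank = {}
        for t in neighbours:
            incoming = sum(rank[u] / len(neighbours[u]) for u in neighbours[t])
            new_rank[t] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank


def content_scores(docs):
    """Corpus-level TF-IDF-style score per term (an assumed content measure)."""
    n_docs = len(docs)
    tf = defaultdict(int)
    df = defaultdict(int)
    for doc in docs:
        for t in doc:
            tf[t] += 1
        for t in set(doc):
            df[t] += 1
    return {t: tf[t] * (math.log(n_docs / df[t]) + 1.0) for t in tf}


def select_top_k_percent(docs, k=10.0, alpha=0.5):
    """Return the top k% of terms by a weighted sum of the two scores."""
    pr = pagerank_scores(docs)
    cs = content_scores(docs)
    pr_max = max(pr.values())
    cs_max = max(cs.values())
    combined = {
        t: alpha * pr[t] / pr_max + (1 - alpha) * cs[t] / cs_max for t in pr
    }
    keep = max(1, math.ceil(len(combined) * k / 100))
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(combined, key=lambda t: (-combined[t], t))
    return ranked[:keep]


if __name__ == "__main__":
    corpus = [
        ["deep", "learning", "classifier"],
        ["deep", "learning", "feature"],
        ["feature", "selection", "pagerank"],
    ]
    print(select_top_k_percent(corpus, k=50))
```

The selected terms would then define the columns of the reduced feature vector passed to a classifier. How the two scores are actually combined in CPRCFS is left open here; the linear mix with `alpha` is only one plausible choice.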
