Abstract

With the rapid growth of data, especially text data, multi-label text classification has become increasingly valuable. Preprocessing, and intelligent feature selection (FS) in particular, is among the most important steps in classification. Each FS method finds the best features according to its own criterion, so we use a multi-strategy approach to find more useful features. Combining multiple strategies and methods is better suited to evaluating and comparing the importance and relevance of features than conventional single-method approaches, because each feature is measured from several perspectives. An ensemble FS method merges the outputs of several methods to exploit their individual strengths and classify better. In this article, we propose EMFS, an ensemble FS method for multi-label text data (MLTD) that, for the first time, uses an order statistics approach. We use four multi-label FS (MLFS) algorithms with differing characteristics to achieve a good result. Order statistics, one of the most important statistical methods, is used to aggregate the feature ranks produced by the different algorithms, and it is robust against noisy, redundant, and inessential features. Finally, the performance of EMFS was evaluated on six MLTDs according to six performance criteria (ranking-based and classification-based). The proposed method was more accurate than the other methods on all MLTDs used, improving on them by 1.5%. This value is derived from the results for the six evaluation criteria over all tested datasets.
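The rank-aggregation step can be illustrated with a minimal Python sketch. The abstract does not state which order-statistics formula EMFS uses, so this sketch assumes one common formulation: the joint cumulative distribution (Q statistic) of a feature's sorted, normalized ranks under the null hypothesis that ranks are uniformly random. The function names `q_statistic` and `aggregate_rankings` and the toy rankings are hypothetical and are not taken from the paper.

```python
from math import factorial


def q_statistic(norm_ranks):
    """Joint-CDF order statistic for one feature.

    norm_ranks: the feature's ranks under each FS method, scaled to (0, 1].
    A small value means the feature is ranked near the top more
    consistently than chance would explain.
    """
    r = sorted(norm_ranks)            # r[0] <= ... <= r[n-1]
    n = len(r)
    v = [1.0]                         # V_0 = 1
    for k in range(1, n + 1):
        x = r[n - k]                  # r_{N-k+1} in 1-based notation
        v.append(sum((-1) ** (i - 1) * v[k - i] / factorial(i) * x ** i
                     for i in range(1, k + 1)))
    return factorial(n) * v[n]        # Q = N! * V_N


def aggregate_rankings(rankings):
    """Combine several feature rankings into one.

    rankings[m][f] is the 1-based rank of feature f under method m
    (1 = most relevant). Returns (feature order, per-feature Q scores),
    with the most consistently top-ranked feature first.
    """
    n_features = len(rankings[0])
    scores = [q_statistic([ranks[f] / n_features for ranks in rankings])
              for f in range(n_features)]
    order = sorted(range(n_features), key=lambda f: scores[f])
    return order, scores


if __name__ == "__main__":
    # Hypothetical ranks from four MLFS methods over five features.
    toy = [
        [1, 2, 3, 4, 5],
        [2, 1, 3, 5, 4],
        [1, 3, 2, 4, 5],
        [5, 1, 2, 3, 4],   # this method disagrees strongly about feature 0
    ]
    order, scores = aggregate_rankings(toy)
    print(order)
    print([round(s, 4) for s in scores])
```

Because the score depends on all of a feature's sorted ranks rather than on their mean, a single dissenting method shifts the result less than it would under simple rank averaging. This is one plausible reading of the robustness claim above.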

