Abstract

In many application domains, such as machine learning, scene and video classification, data mining, medical diagnosis, and machine vision, instances can belong to more than one category. In single-label text classification, feature selection reduces the dimensionality of datasets by filtering out irrelevant and redundant features. Dimensionality reduction in multi-label classification is a different scenario, because here an instance may belong to more than one class. With the rapid growth of the Internet, label and instance spaces are expanding quickly, which makes Multi-Label Classification (MLC) challenging, and feature selection is crucial for data reduction in MLC. Method adaptation and dataset transformation are two techniques used to select features in multi-label text classification. In this paper, we present a dataset transformation technique to reduce the dimensionality of multi-label text data. We use two transformation approaches, Binary Relevance and Label Powerset, to transform the data from multi-label to single-label form. Feature selection is then performed with a filter approach, which uses the data itself to decide the importance of features without applying a learning algorithm. We use a simple measure (ACC2) for feature selection in multi-label text data, applying single-label feature selection measures through the problem transformation approach, and compare ACC2 with two other feature selection methods: information gain (IG) and the Relief measure. Experiments are conducted on three benchmark datasets and their empirical evaluation results are reported. ACC2 is found to perform better than IG and Relief in 80% of the cases in our experiments.
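The two transformation approaches named above can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: Binary Relevance builds one binary (label vs. not-label) dataset per label, while Label Powerset maps each distinct label set to a single composite class. The corpus and labels below are hypothetical toy data.

```python
from itertools import chain

# Toy multi-label corpus: (document, set of labels). Illustrative data only.
corpus = [
    ("stock markets rally", {"finance"}),
    ("league final tonight", {"sports"}),
    ("club signs sponsorship deal", {"sports", "finance"}),
]

def binary_relevance(data):
    """One binary dataset per distinct label: 1 if the document
    carries that label, 0 otherwise."""
    labels = set(chain.from_iterable(y for _, y in data))
    return {l: [(x, int(l in y)) for x, y in data] for l in sorted(labels)}

def label_powerset(data):
    """One multi-class dataset whose classes are the distinct label
    combinations, encoded here as a sorted, comma-joined string."""
    return [(x, ",".join(sorted(y))) for x, y in data]
```

After either transformation, any single-label feature selection measure (such as ACC2, IG, or Relief) can be applied to the resulting dataset(s).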

Highlights

  • A feature is a measurable characteristic or property of the observed process

  • In single-label classification a document belongs to only one label, but in multi-label classification, which is the case in real-world scenarios such as web pages, newspapers, sports magazines and data mining, a document can belong to more than one class; this has become a recent research topic [1]

  • Evaluation measures used for multi-label classification are different from those used for single label classification

Introduction

A feature is a measurable characteristic or property of the observed process. Text data is high-dimensional in nature, and a moderately sized dataset may contain thousands of features. Feature selection (FS) is a data pre-processing step in many machine learning applications that plays an important role in dimensionality reduction [24]. It helps mitigate computational requirements and makes the data easier to understand. Individual feature evaluation is computationally efficient: it evaluates features (variables) and assigns them weights (ranks) according to their predictive ability in classification. However, it ignores the inter-dependency of features and is incapable of removing redundant features [21]. The main objective of feature selection is to select a subset of features with stronger discrimination power [19]. It reduces the effects of redundant and noisy variables by keeping only the features that are effective for prediction [3].
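The individual-evaluation (filter) ranking described above can be sketched concretely with the ACC2 measure, which, following Forman's definition, scores a term by the absolute difference between its true positive rate and false positive rate (the fraction of positive versus negative documents containing the term). This is an illustrative sketch under that assumption, not the paper's code; the term counts below are hypothetical.

```python
def acc2(n_pos_with_term, n_pos, n_neg_with_term, n_neg):
    """ACC2 score: |tpr - fpr|, the absolute difference between the
    fraction of positive and negative documents containing the term."""
    tpr = n_pos_with_term / n_pos
    fpr = n_neg_with_term / n_neg
    return abs(tpr - fpr)

def rank_features(counts, n_pos, n_neg):
    """Rank terms by ACC2, most discriminative first.
    counts maps term -> (positive docs containing it,
                         negative docs containing it)."""
    scores = {t: acc2(p, n_pos, n, n_neg) for t, (p, n) in counts.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical document-frequency counts over 10 positive / 10 negative docs.
counts = {"goal": (8, 1), "the": (9, 9), "bank": (1, 7)}
ranking = rank_features(counts, n_pos=10, n_neg=10)
```

Here a term present in most positive documents but few negative ones ("goal") ranks highest, while a term spread evenly across both classes ("the") ranks lowest, which is exactly the redundancy-and-noise filtering behaviour described above.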
