Abstract

Multi-label classification assigns multiple labels to each document simultaneously. Many real-world classification problems involve high-dimensional label spaces that are naturally organized in a hierarchy, so each instance may belong to several labels and the labels themselves form a hierarchical structure. This setting is more complex than flat classification, because the classification algorithm must account for the hierarchical relationships between labels while predicting multiple labels for the same instance. Few studies have investigated multi-label text classification for the Arabic language, and most of these have focused on flat classification, neglecting the hierarchical structure. This paper therefore explores hierarchical multi-label classification in the context of the Arabic language. It proposes a hierarchical multi-label Arabic text classification (HMATC) model based on a machine learning approach, and investigates the impact of feature selection methods and feature set dimensions on classification performance. In addition, the Hierarchy Of Multilabel ClassifiER (HOMER) algorithm is optimized by examining different sets of multi-label classifiers, clustering algorithms, and numbers of clusters to improve the hierarchical classification. Moreover, this study contributes to existing research by introducing a hierarchical multi-label Arabic dataset in a format suitable for hierarchical classification and making it publicly available. The results reveal that the proposed model outperforms all models considered in the experiments in terms of computational cost, requiring the least time (2 h) among the evaluated models.
In addition, it shows a significant improvement over the state-of-the-art model (the Fatwa model) in terms of Hamming loss (0.004), hierarchical loss (1.723), multi-label accuracy (0.758), subset accuracy (0.292), micro-averaged precision (0.879), micro-averaged recall (0.828), and micro-averaged F-measure (0.853).
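To make the reported metrics concrete, the following is a minimal sketch of how the multi-label measures named above (Hamming loss, subset accuracy, and the micro-averaged precision/recall/F-measure) are typically computed, here with scikit-learn on tiny hypothetical label matrices; the data are illustrative only and are not drawn from the paper's experiments, and hierarchical loss, which depends on the label hierarchy, is not covered by these flat metrics.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, hamming_loss,
                             precision_score, recall_score)

# Hypothetical ground-truth and predicted label matrices:
# 3 instances, 4 labels, one row per instance (1 = label assigned).
y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 0],
                   [1, 1, 0, 1]])
y_pred = np.array([[1, 0, 0, 0],   # misses one label on the first instance
                   [0, 1, 0, 0],
                   [1, 1, 0, 1]])

# Hamming loss: fraction of individual label slots predicted wrongly.
print("Hamming loss:   ", hamming_loss(y_true, y_pred))
# Subset accuracy: fraction of instances whose entire label set is exact.
print("Subset accuracy:", accuracy_score(y_true, y_pred))
# Micro-averaging pools true/false positives over all labels and instances.
print("Micro precision:", precision_score(y_true, y_pred, average="micro"))
print("Micro recall:   ", recall_score(y_true, y_pred, average="micro"))
print("Micro F-measure:", f1_score(y_true, y_pred, average="micro"))
```

Because micro-averaging counts every (instance, label) decision equally, frequent labels dominate the score, which is why the abstract reports micro-averaged figures alongside the per-instance subset accuracy.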
