Multi-Label Annotation and Classification of Arabic Texts Based on Extracted Seed Keyphrases and Bi-Gram Alphabet Feed Forward Neural Networks Model

Fatma Elghannam

doi:10.1145/3539607

Abstract

Text classification is a fundamental problem in natural language processing. Multi-label classification, in which an instance can be associated with more than one label, is a particularly challenging variant. This paper presents a multi-label annotation and classification methodology for Arabic text data that has not previously been annotated as multi-label, with the aim of analyzing and comparing the performance of various multi-label learning approaches. The work has two phases. In the first, hotel reviews are automatically annotated with more than one label according to the aspects they mention, using clusters of extracted seed keyphrases. In the second, experiments compare the performance of various multi-label classification learning methods. The models introduced in this phase include a feed-forward neural network that learns a vector representation based on the bi-gram alphabet rather than the commonly used bag-of-words model. The bi-gram alphabet vector representation has the advantages of reduced feature dimensions and of not requiring natural language processing tools. The results indicate that the feed-forward neural network with the bi-gram alphabet vector representation is a competitive solution for the multi-label text classification problem: it achieved an accuracy of about 75.2% with a standard deviation of 0.062.
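To make the idea of a bi-gram alphabet vector concrete, the following is a minimal sketch in Python. It is not the paper's implementation: the abstract does not specify the alphabet, the normalization, or the weighting, so the choice of 36 Arabic letter code points plus a space, the count normalization, and the names `ALPHABET`, `BIGRAM_INDEX`, and `bigram_alphabet_vector` are all assumptions made for illustration.

```python
from itertools import product

# Assumed alphabet (not taken from the paper): Arabic letters
# U+0621..U+063A and U+0641..U+064A, plus a space to mark word boundaries.
ALPHABET = (
    "".join(chr(cp) for cp in range(0x0621, 0x063B))
    + "".join(chr(cp) for cp in range(0x0641, 0x064B))
    + " "
)

# Each ordered pair of alphabet symbols gets one fixed vector position,
# so the feature dimension is len(ALPHABET) ** 2 regardless of vocabulary.
BIGRAM_INDEX = {
    first + second: position
    for position, (first, second) in enumerate(product(ALPHABET, repeat=2))
}


def bigram_alphabet_vector(text):
    """Return normalized character bi-gram counts for `text`."""
    # Keep only alphabet symbols; digits, punctuation, and diacritics are dropped.
    cleaned = "".join(ch for ch in text if ch in ALPHABET)
    vector = [0.0] * len(BIGRAM_INDEX)
    for first, second in zip(cleaned, cleaned[1:]):
        vector[BIGRAM_INDEX[first + second]] += 1.0
    total = sum(vector)
    # Normalize to relative frequencies; an input with no bi-grams stays all zeros.
    return [value / total for value in vector] if total else vector


features = bigram_alphabet_vector("فندق جميل")
print(len(features))
```

Under these assumptions every review maps to a vector of 37 × 37 = 1369 entries, with no tokenizer, stemmer, or vocabulary needed. Such a vector could then be fed to a feed-forward network with one sigmoid output per label, which is one common way to set up multi-label classification.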

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Publication Date: Nov 25, 2022
Citations: 2

Similar Papers

EnML: Multi-label Ensemble Learning for Urdu Text Classification
Faiza Mehmood ... Hina Ghafoor
ACM Transactions on Asian and Low-Resource Language Information Processing | VOL. 22
22 Sep 2023

Secure multi-label data classification in cloud by additionally homomorphic encryption
Yi Liu ... Xingxin Li
Information sciences | VOL. 468
04 Aug 2018

Multi-label Classification for Clinical Text with Feature-level Attention
Disheng Pan ... Mengya Li
01 May 2020

Multi-Label Arabic Text Classification: An Overview
Nawal Aljedani ... Mounira Taileb
International Journal of Advanced Computer Science and Applications | VOL. 11
01 Jan 2020