Abstract

Multi-Label Text Classification (MLTC) is an essential task in natural language processing; its purpose is to assign multiple labels to each document. Traditional text classification methods, such as classical machine learning approaches, usually suffer from data sparsity and fail to discover relationships within the data. With the development of deep learning algorithms, many authors have applied deep learning to MLTC. In this paper, a novel model for MLTC called Spotted Hyena Optimizer-Long Short-Term Memory (SHO-LSTM), which combines an LSTM network with the SHO algorithm, is proposed. In the LSTM network, the Skip-gram method is used to embed words into the vector space. The new model uses the SHO algorithm to optimize the initial weights of the LSTM network. Adjusting the weight matrix in an LSTM is a major challenge: the more accurate the neuron weights, the more accurate the output. The SHO algorithm is a population-based meta-heuristic that mimics the group hunting behavior of spotted hyenas. In this algorithm, each candidate solution is encoded as a hyena; the hyenas then approach the optimal solution by following the leader hyena. Four datasets (RCV1-v2, EUR-Lex, Reuters-21578, and Bookmarks) are used to evaluate the proposed model. The assessments demonstrate that the proposed model achieves higher accuracy than LSTM, Genetic Algorithm-LSTM (GA-LSTM), Particle Swarm Optimization-LSTM (PSO-LSTM), Artificial Bee Colony-LSTM (ABC-LSTM), Harmony Search Algorithm-LSTM (HSA-LSTM), and Differential Evolution-LSTM (DE-LSTM). The accuracy improvements of the SHO-LSTM model over plain LSTM on the four datasets are 7.52%, 7.12%, 1.92%, and 4.90%, respectively.
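As a rough illustration of the optimization loop described above, the sketch below applies a simplified SHO update (the leader-following encircling step only; the cluster-based group-hunting step of the full algorithm is omitted) to a flattened weight vector. The names sho_minimize and the sphere stand-in fitness are hypothetical; in SHO-LSTM the fitness would decode a hyena into the LSTM's initial weight matrices and score them on validation data.

import numpy as np

def sho_minimize(fitness, dim, n_hyenas=20, n_iters=100, lb=-1.0, ub=1.0, seed=0):
    """Simplified Spotted Hyena Optimizer: candidates (hyenas) follow the leader."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(lb, ub, (n_hyenas, dim))         # each row encodes one hyena
    fit = np.array([fitness(p) for p in pos])
    i = int(fit.argmin())
    best, best_fit = pos[i].copy(), fit[i]

    for t in range(n_iters):
        h = 5.0 * (1.0 - t / n_iters)                  # control parameter decays 5 -> 0
        B = 2.0 * rng.random((n_hyenas, dim))          # swirl factor
        E = 2.0 * h * rng.random((n_hyenas, dim)) - h  # encircling coefficient
        D = np.abs(B * best - pos)                     # distance to the leader hyena
        pos = np.clip(best - E * D, lb, ub)            # move toward/around the leader
        fit = np.array([fitness(p) for p in pos])
        i = int(fit.argmin())
        if fit[i] < best_fit:                          # keep the best hyena found so far
            best, best_fit = pos[i].copy(), fit[i]
    return best, best_fit

# Hypothetical stand-in fitness; in SHO-LSTM this would be validation loss of an
# LSTM initialized from the candidate vector.
sphere = lambda w: float(np.sum(w * w))
w0, loss = sho_minimize(sphere, dim=30)

Each hyena is one candidate solution, matching the encoding described in the abstract; the population converges as the control parameter h decays and the hyenas close in on the current leader.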

Highlights

  • Models developed for Multi-Label Text Classification (MLTC) based on statistical and probabilistic methods depend on several features defined by an expert [1]

  • The results showed that the Root Mean Square Error (RMSE) of the WT-FS-LSTM-Crow Search Algorithm (CSA) model is 0.1536, versus 0.1621 for the WT-FS-LSTM-Particle Swarm Optimization (PSO) model

  • The results showed that the accuracy of the Spotted Hyena Optimizer (SHO)-LSTM model on RCV1-v2, EUR-Lex, Reuters-21578, and Bookmarks was 87.65%, 45.91%, 63.81%, and 42.16%, respectively


Summary

Introduction

Models developed for MLTC based on statistical and probabilistic methods depend on several features defined by an expert [1]. The use of deep learning algorithms is an effective and helpful approach to MLTC: training data are needed in the form of a text dataset, and features are determined automatically, capturing in-depth information that machine learning algorithms can exploit [3,4]. Deep learning has led to substantial advancements in the field of word processing [5]. In this regard, various studies have applied different deep learning models to text processing, such as Convolutional Neural Networks (CNN) [6], LSTM [7], and Recurrent Neural Networks (RNN) [8]. An LSTM cell comprises four blocks: the cell state, the input gate, the forget gate, and the output gate. LSTM uses these four blocks to maintain both long-term and short-term dependencies over sequential data. In plain RNNs, weight updates shrink as they propagate through time; the problem is exacerbated when activation functions that dampen these changes are used in the layers, so the output changes only minimally and the number of iterations required to train the network increases. Due to its strong internal structure, the LSTM processes long sequences better than a plain RNN.
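As context for the four blocks named above, a standard LSTM cell formulation follows; this is common textbook notation rather than the paper's own symbols, with $\sigma$ the logistic sigmoid and $\odot$ elementwise multiplication:

$$
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden output)}
\end{aligned}
$$

The matrices $W_\ast$, $U_\ast$ and biases $b_\ast$ are the weights whose initial values SHO-LSTM optimizes.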

Motivation
Main Contributions
Organization of the Paper
Previous Studies
Related Works
Proposed Model
Pre-Processing
Embedding Words
SHO-LSTM
Evaluation and Results
Evaluation Based on Epochs
Evaluation Based on the Number of Iterations
Conclusions and Future Works
