Abstract

News categorization (NC), which aims to identify the category of a news item by analyzing its content, has made substantial progress since deep learning was introduced into the natural language processing (NLP) field. However, the transformer, although a state-of-the-art model, delivers unsatisfactory classification performance compared with recurrent neural networks (RNNs) and convolutional neural networks (CNNs) when it is not pretrained. To address this problem, this article proposes a novel framework, based on the transformer model, that combines a bidirectional long short-term memory (Bi-LSTM) network with the transformer. In the proposed framework, the self-attention mechanism is replaced with a Bi-LSTM to capture the semantic information in sentences. Meanwhile, an attention mechanism is applied to focus on important words and adjust their weights, mitigating the loss of long-distance information. A pooling network reduces model complexity and highlights the main features by halving the dimension of the hidden state. Finally, after obtaining the hidden representation from these components, we use a contraction network to further capture long-range associations within a text. Experiments on three large-scale corpora were performed to evaluate the proposed framework, and the results demonstrate that our model outperforms other models such as the deep pyramid CNN (DPCNN) and the transformer.
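To make the described architecture concrete, the following is a minimal PyTorch sketch of the pipeline the abstract outlines: a Bi-LSTM in place of the self-attention sublayer, additive word-level attention, and a pooling step that halves the hidden dimension. All names (BiLSTMTransformerBlock, attn_score), layer sizes, the choice of average pooling, and the linear classifier head are illustrative assumptions rather than the paper's exact implementation; the contraction network is omitted because the abstract does not specify its structure.

    import torch
    import torch.nn as nn

    class BiLSTMTransformerBlock(nn.Module):
        """Hypothetical sketch of the proposed framework: a transformer-style
        block whose self-attention sublayer is replaced by a Bi-LSTM, followed
        by word-level attention and a pooling step that halves the hidden size.
        Sizes and the classifier head are illustrative assumptions."""

        def __init__(self, embed_dim=128, hidden_dim=128, num_classes=10):
            super().__init__()
            # Bi-LSTM substitutes for self-attention to capture sentence semantics.
            self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                                  bidirectional=True)
            # Additive attention over time steps to reweight important words.
            self.attn_score = nn.Linear(2 * hidden_dim, 1)
            # Pooling halves the hidden dimension to reduce network complexity.
            self.pool = nn.AvgPool1d(kernel_size=2, stride=2)
            self.classifier = nn.Linear(hidden_dim, num_classes)

        def forward(self, x):                    # x: (batch, seq_len, embed_dim)
            h, _ = self.bilstm(x)                # (batch, seq_len, 2*hidden_dim)
            weights = torch.softmax(self.attn_score(h), dim=1)
            h = weights * h                      # emphasize important words
            h = self.pool(h)                     # halve last dim: (batch, seq_len, hidden_dim)
            doc = h.sum(dim=1)                   # aggregate over time steps
            return self.classifier(doc)          # (batch, num_classes)

As a usage example, model = BiLSTMTransformerBlock() applied to a batch of token embeddings of shape (4, 32, 128) returns a (4, 10) tensor of class logits.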
