Abstract

Text classification plays an important role in many big-data applications by automatically classifying massive collections of text documents. However, the high dimensionality and sparsity of text features make efficient classification challenging. In this paper, we propose a compressive sensing (CS)-based model to speed up text classification. By using CS to reduce the size of the feature space, our model has low time and space complexity when training a text classifier, and the restricted isometry property (RIP) of CS ensures that pairwise distances between text features are well preserved during dimensionality reduction. In particular, with structural random matrices (SRMs), CS avoids the computation and memory bottlenecks of constructing dense random projections. Experimental results demonstrate that CS effectively accelerates text classification while causing hardly any loss of accuracy.
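To make the SRM idea concrete, the sketch below shows the classic construction (random sign flipping, a fast orthonormal transform, then random subsampling), so the full projection matrix never has to be stored explicitly. This is a minimal illustration under our own assumptions (function name, DCT as the fast transform, scaling), not the authors' code:

```python
import numpy as np
from scipy.fft import dct

def srm_project(x, m, rng):
    """Project a length-n feature vector down to m dimensions with an SRM.

    Sketch of the usual sign-flip -> fast transform -> subsample pipeline;
    names and choices here are illustrative assumptions.
    """
    n = x.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)      # randomize the signal's signs
    y = dct(signs * x, norm="ortho")             # fast orthonormal transform
    keep = rng.choice(n, size=m, replace=False)  # random subsampling
    return np.sqrt(n / m) * y[keep]              # rescale to preserve energy

rng = np.random.default_rng(42)
x = rng.random(4096)            # stand-in for a dense text feature vector
z = srm_project(x, 256, rng)
print(z.shape)                  # (256,)
```

Because only the sign pattern and the kept indices need to be stored (or regenerated from a seed), memory stays linear in the feature dimension rather than quadratic, which is the point of SRMs over dense random projections.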

Highlights

  • With the advancement of information technology over the last decade, digital resources have penetrated all fields of society, generating big data that present a new challenge to data mining and information retrieval [1]

  • If the features output by dimensionality reduction (DR) preserve the pairwise distances of the original space well, DR suppresses the loss of training accuracy; we evaluate the effect of structural random matrices (SRMs) on pairwise distances between text features (the sketch after this list illustrates such a check)

  • We develop a compressive sensing (CS)-based model for text classification tasks
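The distance-preservation check mentioned above can be prototyped in a few lines. The sketch below uses a dense, properly scaled Gaussian projection for brevity (the SRM variant shown earlier would slot in the same way); all dimensions and data are hypothetical:

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n, d, m = 50, 4096, 256                 # samples, original dim, reduced dim
X = rng.standard_normal((n, d))         # stand-in for text feature vectors

# Gaussian projection scaled by 1/sqrt(m) so distances are preserved
# in expectation (the Johnson-Lindenstrauss-style guarantee behind RIP).
P = rng.standard_normal((d, m)) / np.sqrt(m)
Y = X @ P

ratio = pdist(Y) / pdist(X)             # per-pair distance ratio after projection
print(f"max relative distortion: {np.abs(ratio - 1).max():.3f}")
```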


Summary

Introduction

With the advancement of information technology over the last decade, digital resources have penetrated all fields of society, generating big data that present a new challenge to data mining and information retrieval [1]. In machine learning (ML), many classifiers can be used to classify texts, such as support vector machines (SVMs) [10], decision trees [11], adaptive boosting (AdaBoost) [12], K-nearest neighbors (KNN) [13], and Naïve Bayes [14]. To train these classifiers, texts must be represented as feature vectors by a feature extraction model, the commonest of which is Bag of Words (BOW) [15]. BOW vectors are typically high-dimensional and sparse, motivating dimensionality reduction (DR). When embedded into a neural network, an autoencoder can end up learning a low-dimensional representation very similar to that of principal component analysis (PCA). Compared with such DR techniques, random projection [23, 24] is a better choice since it avoids model training, but storing the random projection itself remains a challenge due to the huge dimensionality of text features.
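For concreteness, the sketch below wires BOW features into one of the classifiers named above. The toolkit and the toy corpus are our assumptions; the paper does not prescribe a specific implementation:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy corpus and labels (hypothetical, for illustration only).
docs = [
    "cheap flights book now",
    "meeting agenda attached",
    "win a free prize today",
    "quarterly report draft",
]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham (hypothetical)

# BOW features feeding a linear SVM, one of the classifiers cited above.
clf = make_pipeline(CountVectorizer(), LinearSVC())
clf.fit(docs, labels)
print(clf.predict(["free flights prize"]))  # likely [1] on this toy data
```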

The full paper further covers: Background, Evaluation Metrics, the Proposed CS-Based Text Classification, Experimental Results, and Conclusion.