Text classification algorithm based on sparse distributed representation

Yaxin Ran, Hongqi Han

doi:10.1109/aeeca49918.2020.9213479

Abstract

The effectiveness of automatic text classification depends heavily on the training data. Real-world data, however, often contains noise, and cleaning it until it is entirely noise-free is often difficult, expensive, or time-consuming. To address this problem, a novel text classification algorithm is proposed based on sparse distributed representation (SDR), which is extremely tolerant of noise. The algorithm first creates a class-SDR for each class label by merging category feature vectors using a subsampling technique. It then assigns a class label to a document by comparing the overlap between the document's SDR and each class-SDR. Experimental results show that, when trained on noisy data, the algorithm outperforms six frequently used text classification algorithms.
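
The two steps described in the abstract (building one class-SDR per label, then classifying by overlap) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the encoder or the subsampling rule, so the hash-based `token_bits` encoder, the constants `N_BITS` and `BITS_PER_TOKEN`, and the frequency threshold `min_fraction` used in place of the subsampling step are all assumptions made for illustration, as is the toy training data.

```python
import hashlib
from collections import Counter

N_BITS = 2048        # assumed SDR width
BITS_PER_TOKEN = 4   # assumed number of active bits contributed by each token


def token_bits(token):
    """Hypothetical encoder: hash a token to a few stable bit positions."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return {
        int.from_bytes(digest[2 * i:2 * i + 2], "big") % N_BITS
        for i in range(BITS_PER_TOKEN)
    }


def document_sdr(text):
    """Union of the token bits; an SDR is stored as a set of active indices."""
    bits = set()
    for token in text.lower().split():
        bits |= token_bits(token)
    return bits


def class_sdr(documents, min_fraction=0.5):
    """Merge document SDRs, keeping only bits active in enough documents.

    The frequency threshold is an assumed stand-in for the subsampling step.
    """
    counts = Counter()
    for doc in documents:
        counts.update(document_sdr(doc))  # each document counts a bit at most once
    needed = min_fraction * len(documents)
    return {bit for bit, count in counts.items() if count >= needed}


def overlap(a, b):
    """Number of active bits shared by two SDRs."""
    return len(a & b)


def classify(text, class_sdrs):
    """Return the label whose class-SDR overlaps most with the document SDR."""
    doc = document_sdr(text)
    # Ties go to the alphabetically first label.
    return max(sorted(class_sdrs), key=lambda label: overlap(doc, class_sdrs[label]))


# Toy training data (invented for this sketch).
training = {
    "sports": [
        "team wins match",
        "team scores goal",
        "team loses match",
        "server runs code",  # deliberately mislabeled document (noise)
    ],
    "tech": [
        "server runs code",
        "server hosts software",
        "code compiles software",
    ],
}
class_sdrs = {label: class_sdr(docs) for label, docs in training.items()}
print(classify("team match", class_sdrs))
```

In this sketch, the bits of a mislabeled document are dropped from a class-SDR unless they recur across enough documents of that class, which is one plausible way a merged, sparsified representation could tolerate label noise. The actual mechanism in the paper may differ.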
