Abstract

Text classification is a fundamental task in Natural Language Processing. A variety of sequential models can make good predictions, yet there is a lack of connection between language semantics and prediction results. This paper proposes a novel influence score (I-score), a greedy search algorithm called the Backward Dropping Algorithm (BDA), and a novel feature engineering technique called the “dagger technique”. First, the I-score is used to detect and search for the important language semantics in text documents that are useful for making good predictions in text classification tasks. Next, the Backward Dropping Algorithm is proposed to handle long-term dependencies in the dataset. Moreover, the “dagger technique” fully preserves the relationship between the explanatory variable and the response variable. The proposed techniques generalize to other architectures, including feed-forward Artificial Neural Networks (ANNs) and Convolutional Neural Networks (CNNs). In a real-world application on the Internet Movie Database (IMDB), the proposed methods reduce prediction error by 81% relative to popular peer models that do not implement the I-score and the “dagger technique”.
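To make the two core components concrete, the sketch below uses one commonly cited form of the influence score from the influence-measure literature, I = (1/(n·Var(Y))) Σ_j n_j²(Ȳ_j − Ȳ)², where the sum runs over the partition cells induced by the joint levels of a feature subset, together with a minimal greedy Backward Dropping loop built on it. This is an illustrative reconstruction under those assumptions, not the paper’s exact implementation; the function names `i_score` and `backward_dropping`, the normalization, and the stopping rule are placeholders.

```python
import numpy as np

def i_score(X_sub, y):
    """I-score of a subset of discrete features against a response y.

    Uses the form I = (1/(n * Var(y))) * sum_j n_j^2 * (ybar_j - ybar)^2,
    summing over the cells of the partition induced by the joint levels
    of the columns in X_sub. Assumes y has positive variance.
    """
    y = np.asarray(y, dtype=float)
    n, ybar, var = len(y), y.mean(), y.var()
    # Each distinct row of X_sub defines one partition cell.
    _, cell_ids = np.unique(np.asarray(X_sub), axis=0, return_inverse=True)
    score = 0.0
    for j in np.unique(cell_ids):
        y_j = y[cell_ids == j]
        score += len(y_j) ** 2 * (y_j.mean() - ybar) ** 2
    return score / (n * var)

def backward_dropping(X, y, features):
    """Greedy Backward Dropping: repeatedly drop the feature whose removal
    raises the I-score the most; return the subset with the highest score seen."""
    current = list(features)
    best_set, best = list(current), i_score(X[:, current], y)
    while len(current) > 1:
        # Score every subset that is one feature smaller and keep the best.
        score, drop = max(
            (i_score(X[:, [f for f in current if f != d]], y), d) for d in current
        )
        current.remove(drop)
        if score > best:
            best, best_set = score, list(current)
    return best_set, best
```

On binary labels, higher I-scores favor partitions whose cell means separate the two classes well, which is why the score can rank feature subsets without fitting any model.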

Highlights

  • Artificial Neural Networks (ANNs) are created using many layers of fully-connected units that are called artificial neurons

  • Though this paper focuses on Recurrent Neural Networks (RNNs), similar designs implementing the Influence Score (I-score) can be carried out with other types of neural networks

  • Though this paper investigates the Internet Movie Database (IMDB) Movie Dataset, the proposed methods (the I-score, the Backward Dropping Algorithm, and the Dagger Technique) have been adapted to other data sets, such as Chest X-ray Images, producing state-of-the-art results of 99.7% AUC on the held-out test set while using only 12,000–15,000 trainable parameters in the proposed neural network architecture, almost a 98% reduction in trainable weights compared with peers [24,25]


Summary

Overview

Artificial Neural Networks (ANNs) are built from many layers of fully-connected units called artificial neurons. An important roadblock is the optimization difficulty caused by the non-linearity at each layer; because of this, few significant advances were achieved before 2006 [2,3]. A family of ANNs with recurrent connections is called Recurrent Neural Networks (RNNs). These architectures are designed to model sequential data for sequence recognition, classification, and prediction [6].
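As a minimal illustration of what a recurrent connection is, the generic tanh cell below threads a hidden state through a sequence; this is a textbook sketch, not the architecture used in the paper, and all names and shapes are placeholders.

```python
import numpy as np

def rnn_forward(x_seq, W_xh, W_hh, b_h):
    """Run a sequence of input vectors through one tanh RNN cell and
    return the hidden state at every time step."""
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in x_seq:
        # Recurrent connection: the new state depends on the previous state,
        # which is how information from earlier tokens reaches later ones.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return states
```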

Problems in RNN
Problems in Text Classification Using RNN
Performance Diagnosis Test
Remark
Contributions
Organization of Paper
A Novel Influence Measure
An Interaction-Based Feature
Discretization
Backward Dropping Algorithm
A Toy Example
Why Is I-score the Best Candidate?
Non-Parametric Nature
High I-score Produces High AUC Values
I-Score and the “Dagger Technique”
N-gram
Recurrent Neural Network
Backward Propagation Using Gradient Descent
Implementation with I-Score
IMDB Dataset
Results
Conclusions