Abstract

Recurrent neural networks (RNNs) have become ubiquitous in sequence modeling. Despite this progress, discrete unstructured data such as text, audio, and video remain difficult to embed in a feature space. Research on improving neural networks has accelerated with the introduction of more complex and deeper architectures, but these gains depend heavily on the model itself and come at the cost of substantial computational resources; few works attend to the training algorithm instead. In this article, we connect the Taylor series with the construction of RNNs: training an RNN can be viewed as estimating the parameters of a Taylor series. We find, however, that the finite Taylor series contains a discrete term, the remainder, that cannot be optimized by gradient descent; this accounts in part for the truncation error and for the model falling into local optima. To address this, we propose a training algorithm that estimates the range of the remainder and injects remainders sampled from this continuous space into the RNN to assist parameter optimization. Notably, the performance of the RNN improves without any change to the RNN architecture in the testing phase. We demonstrate that our approach achieves state-of-the-art performance on action recognition and cross-modal retrieval tasks.
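For reference, the remainder in question is the one in the classical statement of Taylor's theorem (standard mathematics, not notation taken from this article):

\[
f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x - a)^k + R_n(x),
\qquad
R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}(x - a)^{n+1}
\]

for some \(\xi\) between \(a\) and \(x\) (Lagrange form). Because \(\xi\) is unknown, \(R_n\) is not a differentiable function of the model parameters, so gradient descent can only fit the truncated sum; this is the gap the proposed sampling scheme targets.

As a rough illustration of the sampling idea only (the article's exact procedure for estimating the remainder range is not specified here), the sketch below perturbs an RNN's hidden state during training with a term drawn uniformly from an assumed remainder range. The class name RemainderRNN, the parameter remainder_bound, and the uniform-sampling scheme are hypothetical.

import torch
import torch.nn as nn

class RemainderRNN(nn.Module):
    # Hypothetical sketch: inject a sampled remainder term into the
    # hidden state during training; leave the architecture untouched
    # at test time, as the abstract describes.
    def __init__(self, input_size, hidden_size, remainder_bound=0.1):
        super().__init__()
        self.cell = nn.RNNCell(input_size, hidden_size)
        self.remainder_bound = remainder_bound  # assumed estimated range

    def forward(self, x):  # x: (seq_len, batch, input_size)
        h = x.new_zeros(x.size(1), self.cell.hidden_size)
        for t in range(x.size(0)):
            h = self.cell(x[t], h)
            if self.training:
                # Sample a remainder uniformly from [-bound, bound]
                # and add it to the hidden state (training only).
                eps = torch.empty_like(h).uniform_(
                    -self.remainder_bound, self.remainder_bound)
                h = h + eps
        return h

model = RemainderRNN(input_size=16, hidden_size=32)
h = model(torch.randn(10, 4, 16))  # training mode: remainder injected
model.eval()                       # test mode: a plain RNN, unchanged

Switching on self.training keeps the test-time network identical to an ordinary RNN, which is consistent with the abstract's claim that no architectural change is needed during testing.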

