Abstract

A new type of information-theoretic method is proposed to improve prediction performance in supervised learning. The method has two main technical features. First, the complicated procedures used to increase information content are replaced by the direct use of hidden neuron outputs: information is controlled by directly changing these outputs. Second, to increase information content and decrease errors between targets and outputs at the same time, the information acquisition and use phases are separated. In the information acquisition phase, an autoencoder tries to acquire as much information on the input patterns as possible. In the information use phase, the information obtained in the acquisition phase is used for supervised learning. The method is a simplified version of actual information maximization and deals directly with the outputs from neurons. It was applied to three data sets: the Iris, bankruptcy, and rebel participation data sets. Experimental results showed that the proposed simplified information acquisition method was effective in increasing real information content, and that by using this information content, generalization performance was greatly improved.
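
The abstract describes the method only at a high level, but the two-phase structure can be made concrete. Below is a minimal sketch in Python/NumPy under stated assumptions: a one-hidden-layer sigmoid autoencoder stands in for the information acquisition phase, and the acquired encoder is then reused, frozen here for simplicity, in the information use phase. The toy data is a stand-in for a set such as Iris, and all names, sizes, and learning rates are illustrative rather than the authors' settings.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Toy stand-in for a data set such as Iris: inputs X and integer labels y.
    N, D, J, C = 150, 4, 6, 3              # samples, inputs, hidden neurons, classes
    X = rng.normal(size=(N, D))
    y = rng.integers(0, C, size=N)
    T = np.eye(C)[y]                       # one-hot targets
    lr = 0.1

    # ---- Phase 1: information acquisition (autoencoder) ----
    # The encoder is trained to reconstruct the inputs, so the hidden outputs
    # come to carry as much information on the input patterns as possible.
    W1 = rng.normal(scale=0.1, size=(D, J)); b1 = np.zeros(J)
    W2 = rng.normal(scale=0.1, size=(J, D)); b2 = np.zeros(D)
    for _ in range(500):
        H = sigmoid(X @ W1 + b1)           # hidden neuron outputs
        err = (H @ W2 + b2) - X            # reconstruction error
        dH = err @ W2.T * H * (1.0 - H)    # backprop through the sigmoid
        W2 -= lr * H.T @ err / N; b2 -= lr * err.mean(0)
        W1 -= lr * X.T @ dH / N;  b1 -= lr * dH.mean(0)

    # ---- Phase 2: information use (supervised learning) ----
    # The acquired encoder is kept fixed (an assumption; it could also be
    # fine-tuned); only an output layer is trained on the targets, so error
    # minimization no longer competes with information acquisition.
    H = sigmoid(X @ W1 + b1)
    W3 = rng.normal(scale=0.1, size=(J, C)); b3 = np.zeros(C)
    for _ in range(500):
        logits = H @ W3 + b3
        P = np.exp(logits - logits.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)  # softmax outputs
        G = (P - T) / N                    # cross-entropy gradient
        W3 -= lr * H.T @ G; b3 -= lr * G.sum(0)

    print("training accuracy:", (P.argmax(axis=1) == y).mean())

The separation is visible in the structure: the reconstruction objective of the acquisition phase never competes with the classification objective of the use phase within a single learning process.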

Highlights

  • The Infomax principle holds that living systems try to maximize information content at every stage of information processing

  • The proposed procedure is composed of two steps, namely, realization of information maximization by directly controlling hidden neuron outputs and the separation of the information acquisition and use phases

  • Much importance is placed on neurons with larger variances; this direct use of outputs can facilitate the process of information maximization and reduce computational complexity (see the sketch following these highlights)
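
The highlights do not spell out how variance determines importance, so the following Python sketch is one hedged reading rather than the authors' exact measure: treat each hidden neuron's normalized output variance as a probability, and take the information to be the maximum entropy log J minus the entropy of that distribution. The function name hidden_information is hypothetical.

    import numpy as np

    def hidden_information(H):
        """Entropy-based information of hidden outputs H (N samples x J neurons)."""
        v = H.var(axis=0)                    # per-neuron output variance
        p = v / v.sum()                      # normalized importance of each neuron
        entropy = -np.sum(p * np.log(p + 1e-12))
        return np.log(H.shape[1]) - entropy  # log J minus entropy

    # Near-uniform variances give low information; one dominant neuron raises it.
    rng = np.random.default_rng(0)
    H_uniform = rng.uniform(size=(200, 5))
    H_skewed = H_uniform.copy()
    H_skewed[:, 0] *= 10.0                   # inflate one neuron's output variance
    print(hidden_information(H_uniform))     # close to 0
    print(hidden_information(H_skewed))      # much closer to log(5) ~ 1.61

Under this reading, the information is near zero when all hidden neurons carry equal variance and grows toward log J as a single high-variance neuron comes to dominate, consistent with the emphasis on neurons with larger variances.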



Introduction

In other words, living systems should acquire as much information as possible in order to maintain their existence. Following this principle, there have been many attempts to use information-theoretic methods in neural networks [5,6,7,8,9]. However, information or entropy functions require complex learning formulas, which suggests that information-theoretic methods can be effective only for relatively small neural networks. In the proposed method, the information acquisition and use phases are separated, because it has been difficult to maximize information and minimize error at the same time. The information content acquired in the first phase is then used to train supervised neural networks, which eliminates the contradiction between information maximization and error minimization within the same learning process. The effectiveness of this separation has been demonstrated in the field of deep learning [16,17,18,19].

