Abstract
We review two forms of immediate reward reinforcement learning: in the first, the learner is a stochastic node, while in the second the individual unit is deterministic but has stochastic synapses. We illustrate the first method on the problem of independent component analysis. Four learning rules have been developed from the second perspective, and we investigate their use in performing linear projection techniques such as principal component analysis, exploratory projection pursuit and canonical correlation analysis. The method is very general and requires only a reward function specific to the task we require the unit to perform. We also discuss how the method can be used to learn kernel mappings and conclude by illustrating its use on a topology-preserving mapping.
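To make the second perspective concrete, the following is a minimal sketch of a deterministic linear unit with stochastic synapses learning a first principal component. It is not one of the four learning rules reviewed here. It assumes Gaussian synapses with a fixed standard deviation `sigma`, a REINFORCE-style update of the synapse means, and a reward equal to the variance of the projection (a Rayleigh quotient) computed over the whole data set. The function name `learn_first_pc`, its parameters and the choice of baseline are all hypothetical.

```python
import numpy as np


def learn_first_pc(X, n_steps=3000, eta=0.01, sigma=0.1, seed=0):
    """Estimate the first principal direction of X with stochastic synapses.

    Assumed setup (illustrative only): each weight is sampled from a
    Gaussian centred on its mean m_i, the unit's output is the
    deterministic projection y = w . x, and the reward is the projection
    variance normalised by the squared length of w.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / len(Xc)  # covariance matrix, used only to evaluate the reward

    def reward(w):
        # Mean of y^2 over the data, divided by |w|^2 (Rayleigh quotient).
        return (w @ C @ w) / (w @ w)

    # Synapse means, started in a random direction of unit length.
    m = rng.normal(size=X.shape[1])
    m /= np.linalg.norm(m)

    for _ in range(n_steps):
        # Sample the stochastic synapses around their current means.
        w = m + sigma * rng.normal(size=m.shape)
        # The reward arrives immediately; the reward at the mean weights is the baseline.
        advantage = reward(w) - reward(m)
        # REINFORCE-style step: move the means towards samples that beat the baseline.
        m += eta * advantage * (w - m) / sigma**2
        # Keep the means at unit length so the step size stays meaningful.
        m /= np.linalg.norm(m)
    return m
```

Only the `reward` function is specific to principal component analysis. Swapping in a different reward (for example a measure of non-Gaussianity for exploratory projection pursuit) would leave the sampling and update steps of this sketch unchanged, which is the sense in which the method is general.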