Abstract

Neural conditioning associates cues and actions with following rewards. The environments in which robots operate, however, are pervaded by a variety of disturbing stimuli and uncertain timing. In particular, variable reward delays make it difficult to reconstruct which previous actions are responsible for following rewards. Such uncertainty is handled by biological neural networks, but represents a challenge for computational models, suggesting the lack of a satisfactory theory for robotic neural conditioning. The present study demonstrates the use of rare neural correlations in making correct associations between rewards and previous cues or actions. Rare correlations are functional in selecting sparse synapses to be eligible for later weight updates if a reward occurs. The repetition of this process singles out the associating and reward-triggering pathways, and thereby copes with distal rewards. The neural network displays macro-level classical and operant conditioning, which is demonstrated in a real-life human-robot interaction. The proposed mechanism models realistic conditioning in humans and animals and implements similar behaviors in neuro-robotic platforms.
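The idea of rare correlations marking sparse synapses as eligible for a later, reward-gated update can be illustrated with a toy sketch. This is not the authors' model: the function name `run_trial`, the random "correlation score" standing in for pre/post-synaptic activity, the threshold `theta`, the trace decay, the fixed reward delay, and the learning rate are all assumptions chosen only for illustration.

```python
import random


def run_trial(n_synapses=20, steps=5000, target=3, delay=10,
              theta=0.99, trace_decay=0.95, lr=0.1, seed=0):
    """Toy rare-correlation eligibility sketch; every value is an assumption."""
    rng = random.Random(seed)
    weights = [0.0] * n_synapses
    traces = [0.0] * n_synapses   # per-synapse eligibility traces
    pending = []                  # time steps at which a delayed reward arrives

    for t in range(steps):
        # Hypothetical correlation score per synapse (stand-in for pre*post activity).
        corr = [rng.random() for _ in range(n_synapses)]

        # Decay all traces; only rare, above-threshold correlations reset a trace to 1.
        for i in range(n_synapses):
            traces[i] *= trace_decay
            if corr[i] > theta:
                traces[i] = 1.0

        # A rare correlation at the target synapse triggers a reward `delay` steps later.
        if corr[target] > theta:
            pending.append(t + delay)

        # When a reward is due, update every synapse in proportion to its trace.
        if pending and pending[0] == t:
            pending.pop(0)
            for i in range(n_synapses):
                weights[i] += lr * traces[i]

    return weights


if __name__ == "__main__":
    w = run_trial()
    best = max(range(len(w)), key=lambda i: w[i])
    print("synapse with largest weight:", best)
```

In this sketch the target synapse always carries a partly decayed trace when its delayed reward arrives, whereas the other synapses are only occasionally eligible, so repeated rewards tend to single out the reward-triggering synapse. Lowering `theta` makes correlations common and erodes that separation, which is the role rarity plays in this toy setting.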

Highlights

  • In reward learning, the results of actions, manifested as rewards or punishments, often occur seconds after the actions that caused them

  • The present study demonstrates the use of rare neural correlations in making correct associations between rewards and previous cues or actions

  • This study demonstrates neural robotic conditioning in human-robot interactive scenarios with delayed rewards, disturbing stimuli, and uncertain timing

Summary

Introduction

The results of actions, manifested as rewards or punishments, often occur seconds after the actions that caused them. For this reason, it is not always easy to determine which previous stimuli and actions are causally associated with following rewards. This problem was named the distal reward problem (Hull, 1943), or credit assignment problem (Sutton and Barto, 1998). The problem, and the ability of animals to solve it, emerged originally in behavioral psychology (Thorndike, 1911; Pavlov, 1927; Skinner, 1953). The ability to determine such relationships is distinctive of human and animal intelligence.
