Abstract
Neural conditioning associates cues and actions with following rewards. The environments in which robots operate, however, are pervaded by a variety of disturbing stimuli and uncertain timing. In particular, variable reward delays make it difficult to reconstruct which previous actions are responsible for following rewards. Such uncertainty is handled by biological neural networks, but represents a challenge for computational models, suggesting the lack of a satisfactory theory for robotic neural conditioning. The present study demonstrates the use of rare neural correlations in making correct associations between rewards and previous cues or actions. Rare correlations select a sparse set of synapses to be eligible for later weight updates if a reward occurs. The repetition of this process singles out the associating and reward-triggering pathways, and thereby copes with distal rewards. The neural network displays macro-level classical and operant conditioning, which is demonstrated in a real-life human-robot interaction. The proposed mechanism models realistic conditioning in humans and animals and implements similar behaviors in neuro-robotic platforms.
Highlights
In reward learning, the results of actions, manifested as rewards or punishments, often occur seconds after the actions that caused them
The present study demonstrates the use of rare neural correlations in making correct associations between rewards and previous cues or actions
This study demonstrates neural robotic conditioning in human-robot interactive scenarios with delayed rewards, disturbing stimuli, and uncertain timing
Summary
The results of actions, manifested as rewards or punishments, often occur seconds after the actions that caused them. For this reason, it is not always easy to determine which previous stimuli and actions are causally associated with following rewards. This problem was named the distal reward problem (Hull, 1943), or the credit assignment problem (Sutton and Barto, 1998). The problem, and the ability of animals to solve it, emerged originally in behavioral psychology (Thorndike, 1911; Pavlov, 1927; Skinner, 1953). The ability to determine such relationships is distinctive of human and animal intelligence.
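The idea of bridging a reward delay with sparse, correlation-gated eligibility can be illustrated with a toy sketch. This is not the study's implementation: the function names, the correlation threshold, the trace decay factor, and the learning rate are all assumptions chosen for illustration. A synapse becomes eligible only when the product of its pre- and postsynaptic activity exceeds a high threshold (a rare event), the eligibility then decays over time, and a later reward changes only the synapses that are still eligible.

```python
# Toy sketch (illustrative assumptions throughout, not the paper's model).

def update_traces(traces, pre, post, threshold=0.5, decay=0.9):
    """Decay every trace, then mark only synapses with a rare, strong correlation."""
    new = [[c * decay for c in row] for row in traces]
    for j, pre_j in enumerate(pre):
        for i, post_i in enumerate(post):
            if pre_j * post_i > threshold:  # rare correlation event
                new[j][i] += 1.0
    return new

def apply_reward(weights, traces, reward, lr=0.1):
    """Change weights in proportion to reward times the surviving eligibility."""
    return [[w + lr * reward * c for w, c in zip(w_row, c_row)]
            for w_row, c_row in zip(weights, traces)]

def run_demo():
    n_pre, n_post = 3, 2
    weights = [[0.0] * n_post for _ in range(n_pre)]
    traces = [[0.0] * n_post for _ in range(n_pre)]

    # Step 0: cue 0 and action 0 are strongly co-active (0.9 * 0.9 > 0.5).
    traces = update_traces(traces, [0.9, 0.1, 0.1], [0.9, 0.1])

    # Steps 1-4: weak disturbing activity that never crosses the threshold.
    for _ in range(4):
        traces = update_traces(traces, [0.3, 0.2, 0.3], [0.3, 0.2])

    # The reward arrives only now, several steps after the causal event.
    return apply_reward(weights, traces, reward=1.0)

weights = run_demo()
for row in weights:
    print(row)
```

In this sketch only the synapse from cue 0 to action 0 changes, because it is the only one that was made eligible before the delayed reward arrived; the disturbing activity in between leaves the other weights untouched.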