Abstract

We consider the problem of a learning mechanism (a robot or an algorithm) that learns a parameter while interacting with either a stochastic teacher or a stochastic compulsive liar. The problem is modeled as follows: the learning mechanism tries to locate an unknown point on a real interval by interacting with a stochastic environment through a series of guesses. For each guess, the environment (teacher) informs the mechanism, possibly erroneously, which way it should move to reach the point. Thus, there is a non-zero probability that the feedback from the environment is erroneous. When the probability of a correct response is p > 0.5, the environment is said to be Informative, and we have the case of learning from a stochastic teacher. When p < 0.5, the environment is deemed Deceptive, and is called a stochastic compulsive liar.
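
The feedback model described above can be illustrated with a minimal Python sketch. It covers only the environment, not the learning scheme, which the abstract does not specify. The function names, the "left"/"right" encoding, the unit interval, and the target value 0.7 are all assumptions made for illustration.

```python
import random


def true_direction(guess, target):
    """Direction the learner should move to reach the target.

    Assumption: a guess exactly on the target is reported as "left";
    the abstract does not say how this case is handled.
    """
    return "right" if guess < target else "left"


def environment_feedback(guess, target, p, rng):
    """Return the true direction with probability p, else the opposite.

    p > 0.5 models an Informative environment (stochastic teacher);
    p < 0.5 models a Deceptive one (stochastic compulsive liar).
    """
    truth = true_direction(guess, target)
    if rng.random() < p:
        return truth
    return "left" if truth == "right" else "right"


def fraction_correct(p, trials=10000, seed=0, target=0.7):
    """Empirical rate of correct feedback over uniformly random guesses.

    The target 0.7 on the interval [0, 1) is a hypothetical value.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        guess = rng.random()
        if environment_feedback(guess, target, p, rng) == true_direction(guess, target):
            correct += 1
    return correct / trials
```

Calling `fraction_correct(0.8)` should give a rate close to 0.8, and `fraction_correct(0.2)` a rate close to 0.2. The learner observes only the feedback, never p itself, so it does not know in advance which kind of environment it faces.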
