Abstract

The authors consider the problem of a learning machine (LM) interacting with a random environment which offers the LM M actions. They endeavor to obtain machines which yield an ordering of the individual actions in terms of their optimality. The problem is posed in its generality and a formal solution is provided using a new mathematical model called the random race. In the simplest learning model, the learning multiple race track (LMRT) model, each racer runs on his own track, thus disallowing interference between the racers. However, in a more interactive setting, the learning single race track (LSRT) model, the racers run on a single track, and in this case, interference between racers is specified in terms of overtaking rules. The order in which the racers reach the goal is defined as the ordering to which the learning machine converges. The LMRT scheme is examined, and various results are derived for the case when no a priori information is utilized and for the case when the LMRT uses a priori information by giving the racers handicaps which are either uniformly or geometrically distributed. Analogous results for the LSRT are conjectured.
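
The multiple-race-track idea can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' formal model: it assumes each racer corresponds to one action and advances one step whenever the environment rewards that action, that a handicap is a head start measured in steps, and that ties at the goal are broken by racer index. The function name lmrt_race and its parameters reward_probs, goal, handicaps and seed are hypothetical and chosen only for illustration.

```python
import random


def lmrt_race(reward_probs, goal, handicaps=None, seed=None):
    """Simulate one race on separate tracks and return the finishing order.

    Assumptions (not taken from the abstract): racer i moves one step forward
    with probability reward_probs[i] at every time instant, a handicap is a
    head start in steps, and simultaneous finishers are ordered by index.
    """
    if any(not 0.0 < p <= 1.0 for p in reward_probs):
        raise ValueError("every reward probability must lie in (0, 1]")

    rng = random.Random(seed)
    m = len(reward_probs)
    # No a priori information corresponds to every racer starting at zero.
    positions = list(handicaps) if handicaps is not None else [0] * m
    done = [False] * m
    order = []

    while True:
        # Record racers that have reached the goal, lowest index first.
        for i in range(m):
            if not done[i] and positions[i] >= goal:
                done[i] = True
                order.append(i)
        if len(order) == m:
            return order
        # Each unfinished racer runs on its own track, so moves are independent.
        for i in range(m):
            if not done[i] and rng.random() < reward_probs[i]:
                positions[i] += 1


# Example: three actions, no handicaps, goal 50 steps away.
ordering = lmrt_race([0.8, 0.5, 0.2], goal=50, seed=1)
```

The returned list plays the role of the ordering to which the machine converges in this sketch. A larger goal makes it more likely that the finishing order matches the order of the reward probabilities, at the cost of a longer race.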
