Consider a learning machine (LM) interacting with an environment ε that offers the machine M actions. Traditionally, learning systems endeavor to compute the best action that the environment offers, and they do so without any estimation procedure. In this paper, we consider the problem of the LM computing not only the optimal action but also the ordering of the actions in terms of their optimality. The problem is posed in its generality, and various norms of learning in this setting are formalized. We also present various learning strategies that use a new mathematical model called the random race. In this model, learning is represented by M racers running toward a goal. At each instant, racer R_i moves toward the goal with probability s_i and stays where it is with probability (1 - s_i). In the simplest model, the learning multiple race track (LMRT) model, each racer runs on its own track, which rules out interference between the racers. In the more general learning single race track (LSRT) model, the racers run on a single track, and interference between racers is specified in terms of overtaking rules. In this paper, we first examine the LMRT model and show that, in the absence of a priori information, the LMRT is permutationally ε-optimal in all suggestive random environments. Other results are proven or conjectured.
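
The multiple-track race described above can be illustrated with a short simulation. The following Python sketch is not taken from the paper. It assumes a finite track length, reads the learned ordering off the order in which racers reach the goal, and breaks same-instant arrivals at random. The names `run_lmrt`, `correct_order_rate`, and `track_length` are hypothetical and are introduced only for this illustration.

```python
import random


def run_lmrt(success_probs, track_length, rng):
    """Run one race on independent tracks and return racer indices in finishing order.

    Assumptions (not stated in the abstract): the goal lies `track_length` steps
    away, and racers that arrive at the same instant are ordered at random.
    """
    if any(not (0.0 < s <= 1.0) for s in success_probs):
        raise ValueError("each s_i must lie in (0, 1] so that every racer finishes")

    positions = [0] * len(success_probs)
    finished = set()
    finish_order = []

    while len(finish_order) < len(success_probs):
        arrivals = []
        for i, s in enumerate(success_probs):
            if i in finished:
                continue
            # Racer R_i advances with probability s_i, otherwise it stays put.
            if rng.random() < s:
                positions[i] += 1
                if positions[i] >= track_length:
                    arrivals.append(i)
        rng.shuffle(arrivals)  # assumed tie-breaking rule
        for i in arrivals:
            finished.add(i)
            finish_order.append(i)

    return finish_order


def correct_order_rate(success_probs, track_length, trials, seed=0):
    """Estimate how often the finishing order matches the true ordering by s_i."""
    rng = random.Random(seed)
    target = sorted(range(len(success_probs)), key=lambda i: -success_probs[i])
    hits = sum(
        run_lmrt(success_probs, track_length, rng) == target for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    probs = [0.8, 0.6, 0.4]  # hypothetical s_i values for M = 3 actions
    for length in (5, 20, 80):
        print(length, correct_order_rate(probs, length, trials=2000))
```

Under these assumptions, a longer track gives each racer more steps over which its advance rate can separate from the others, so the estimated rate of correct orderings would be expected to rise with `track_length`. This only illustrates the race mechanism; it does not reproduce the paper's formal optimality result.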