Designing walking gaits for biped robots that preserve stability against a known range of disturbances is very important in real applications. An exponentially stable walking gait with desired features has recently been designed for biped robots using an online reinforcement learning method. However, a gait designed this way might not be robust enough against disturbances. In this paper, we extend the method to a version that is robust against modeling errors/disturbances. Robustness is obtained by minimizing the costs of the worst rollouts, which are generated in the presence of different modeling errors/disturbances. The ability of the proposed method to adapt the controller is studied in several applications that require robustness. Simulations show that the resulting gaits are exponentially stable and robust against modeling errors/disturbances within a feasible range.