Wireless body sensor networks (WBSNs) efficiently sense a patient's vital parameters, which helps deliver better healthcare. Wearable biosensors with networking capability have made WBSNs practical to implement, and the technology promises improved healthcare services for the community. A WBSN consists of a few sensor nodes that observe the patient's vital parameters and forward them to the required destination through intermediate nodes along the best available paths. This paper proposes a cluster-based routing protocol that uses reinforcement learning, specifically Q-learning, to find an optimal route. Simulations are carried out with a set of biomedical sensors covering an area of 1000 × 1000 m². The simulation is run for … seconds. The reinforcement learning algorithm is found to route packets faster than the other algorithms it is compared with.
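To make the idea of Q-learning-based route selection concrete, the following is a minimal sketch of generic tabular Q-learning on a toy network, not the protocol proposed in this paper. The topology, the link delays, the node names (`sensor`, `CH1` to `CH3`, `sink`), the hyperparameters, and the helper functions `train` and `best_route` are all illustrative assumptions. Each node keeps an estimate of the total delay to the sink through each neighbour and moves that estimate toward the observed link delay plus the neighbour's best estimate.

```python
import random

# Hypothetical topology (an assumption, not taken from the paper):
# link delays in arbitrary units between a sensor node, three cluster
# heads (CH1..CH3), and the sink.
LINKS = {
    "sensor": {"CH1": 1.0, "CH2": 4.0},
    "CH1": {"CH2": 1.0, "CH3": 5.0},
    "CH2": {"CH3": 1.0, "sink": 6.0},
    "CH3": {"sink": 1.0},
}
SINK = "sink"


def train(links, sink, episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Learn q[node][next_hop], the estimated total delay from node to the sink."""
    rng = random.Random(seed)
    q = {node: {nxt: 0.0 for nxt in nbrs} for node, nbrs in links.items()}
    for _ in range(episodes):
        # Each episode forwards one packet from a randomly chosen node.
        node = rng.choice(list(links))
        while node != sink:
            nbrs = list(links[node])
            if rng.random() < epsilon:
                nxt = rng.choice(nbrs)  # explore a random neighbour
            else:
                nxt = min(nbrs, key=lambda n: q[node][n])  # exploit lowest estimate
            # Best remaining delay as currently estimated by the chosen neighbour.
            future = 0.0 if nxt == sink else min(q[nxt].values())
            target = links[node][nxt] + gamma * future
            # Standard Q-learning update, written for cost minimisation.
            q[node][nxt] += alpha * (target - q[node][nxt])
            node = nxt
    return q


def best_route(q, source, sink):
    """Follow the lowest-delay estimate hop by hop from source to sink."""
    route = [source]
    while route[-1] != sink:
        node = route[-1]
        route.append(min(q[node], key=q[node].get))
    return route


if __name__ == "__main__":
    q_table = train(LINKS, SINK)
    print(best_route(q_table, "sensor", SINK))
```

In this toy graph the lowest-delay route is sensor → CH1 → CH2 → CH3 → sink, with a total delay of 4 units, even though shorter routes in hop count exist. A real WBSN protocol would also have to account for cluster formation, energy, and link quality, none of which are modelled here.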