Abstract

This paper reports the use of a response surface model (RSM) together with reinforcement learning (RL) to solve the travelling salesman problem (TSP). In contrast to heuristic approaches to setting the parameters of RL, the proposed method allows a systematic estimation of the learning rate and the discount factor. The Q-learning and SARSA algorithms were applied to standard problems from the TSPLIB library. Computational results demonstrate that RSM-based parameter estimation produces better solutions on both symmetric and asymmetric TSP instances.
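To make the role of the two tuned parameters concrete, the following is a minimal sketch of tabular Q-learning on a toy TSP instance. It is not the authors' implementation: the function names (`tour_length`, `q_learning_tsp`), the reward definition (negative edge cost), the epsilon-greedy exploration, the treatment of the closing edge, and the four-city instance are all assumptions made for illustration. The learning rate `alpha` and discount factor `gamma` are the quantities that the abstract says are estimated systematically rather than by hand.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def q_learning_tsp(dist, alpha=0.1, gamma=0.9, epsilon=0.1,
                   episodes=500, seed=0):
    """Tabular Q-learning sketch; q[i][j] scores moving from city i to j.

    Assumed design: reward is the negative edge cost, and the value of
    the final state is the negative cost of returning to city 0.
    """
    rng = random.Random(seed)
    n = len(dist)
    q = [[0.0] * n for _ in range(n)]
    best_tour, best_len = None, math.inf

    for _ in range(episodes):
        city = 0
        tour = [city]
        unvisited = set(range(1, n))
        while unvisited:
            choices = sorted(unvisited)
            if rng.random() < epsilon:
                nxt = rng.choice(choices)  # explore
            else:
                nxt = max(choices, key=lambda c: q[city][c])  # exploit
            unvisited.remove(nxt)
            reward = -dist[city][nxt]
            if unvisited:
                future = max(q[nxt][c] for c in unvisited)
            else:
                future = -dist[nxt][0]  # closing edge back to the start
            # Q-learning update: alpha is the learning rate,
            # gamma is the discount factor.
            q[city][nxt] += alpha * (reward + gamma * future - q[city][nxt])
            tour.append(nxt)
            city = nxt

        length = tour_length(tour, dist)
        if length < best_len:
            best_tour, best_len = tour, length

    return best_tour, best_len


# Hypothetical toy instance: four cities on the corners of a unit square.
coords = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
dist = [[math.hypot(ax - bx, ay - by) for (bx, by) in coords]
        for (ax, ay) in coords]

best_tour, best_len = q_learning_tsp(dist, alpha=0.1, gamma=0.9)
print(best_tour, best_len)
```

Because `dist` is indexed as `dist[i][j]` and never assumed symmetric, the same sketch accepts asymmetric cost matrices. A response surface study would, under this setup, run `q_learning_tsp` over a designed set of (`alpha`, `gamma`) pairs and fit a low-order polynomial to the resulting tour lengths; that fitting step is not shown here.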
