Robots for rescue operations after a disaster are an interesting and challenging research problem that has the potential to save lives and reduce economic losses after a disaster. We developed TOPTWR, an extension of the popular TOPTW model, to model the issues in task allocation for teams of rescue robots. Our hybrid algorithm is based on a team of heterogeneous humanoid robots trying to optimize five objectives (task rewards, task completion time, total energy, maximum energy consumption for a single robot, and missed deadline penalties). A common approach to solve these kinds of problems are multi-objective evolutionary algorithms (MOEAs), but their major disadvantage is that they cannot deal with dynamic environments easily. This paper presents an efficient solution for TOPTWR by combining MOEAs with learning algorithms. A novel Extended Multi-Start Simulated Annealing Iterated Local Search (EMSAILS) operator using a modern state-of-the-art NSGA-III algorithm is proposed. In addition, we applied Q-Learning to learn the likely changes in the environment and how to react to them. This algorithm, HMO-TOPTWR-NSGA-III (HMO-N-L), uses an artificial neural network (ANN) as a function approximator to make the huge state and action spaces tractable. This paper includes a thorough empirical evaluation demonstrating the effectiveness of the multi-objective algorithm in both static and dynamic environments. The evaluation shows that the proposed algorithm reduces the error by up to 42% against three state-of-the-art approaches to TOPTW (HMO-N, MSA, and IPI).