To meet the 2050 CO2 targets, the shipping industry, which is responsible for about 3% of global CO2 emissions, needs to be optimized in several respects. Alternative fuels constitute the main measure here. However, relatively high fuel prices, in combination with increasing political and economic pressure, may raise the need for more efficient ship operation. Ship route optimization can make an indispensable contribution to achieving this goal. This paper therefore applies an innovative approach to route optimization based on Reinforcement Learning (RL). For this purpose, a generic ship model is first developed using Artificial Neural Networks (ANNs) to predict the fuel consumption of the ship. Three RL methods are then applied: Deep Q-Network (DQN), Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO). The application of RL enables simultaneous optimization of ship speed and heading, in a continuous action space where the method supports it. DDPG, an off-policy policy gradient method that allows a continuous action space, gives the best results. For example, in the fuel consumption minimization scenario without a time limit, DDPG achieves savings of 6.64%, whereas DQN, a method with a discrete action space, achieves 1.07%.
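The difference between a discrete and a continuous action space for speed and heading can be illustrated with a minimal sketch. It is not the implementation used in this paper: the speed and heading bounds, the grid sizes, and the function names `scale`, `continuous_action`, and `discrete_actions` are hypothetical and chosen only for illustration.

```python
import itertools

# Hypothetical bounds, assumed for illustration only (not taken from the paper).
SPEED_RANGE_KN = (8.0, 16.0)        # ship speed in knots
HEADING_RANGE_DEG = (-30.0, 30.0)   # heading change per step in degrees


def scale(x, low, high):
    """Map x in [-1, 1] linearly onto [low, high]; out-of-range input is clipped."""
    x = max(-1.0, min(1.0, x))
    return low + (x + 1.0) * 0.5 * (high - low)


def continuous_action(raw_speed, raw_heading):
    """Continuous case (as a DDPG-style actor would use): two outputs in
    [-1, 1] are scaled to a speed and a heading change."""
    return (scale(raw_speed, *SPEED_RANGE_KN),
            scale(raw_heading, *HEADING_RANGE_DEG))


def discrete_actions(n_speeds=3, n_headings=5):
    """Discrete case (as a DQN-style agent would use): a fixed grid of
    (speed, heading change) pairs; the agent selects one index."""
    speeds = [scale(-1.0 + 2.0 * i / (n_speeds - 1), *SPEED_RANGE_KN)
              for i in range(n_speeds)]
    headings = [scale(-1.0 + 2.0 * j / (n_headings - 1), *HEADING_RANGE_DEG)
                for j in range(n_headings)]
    return list(itertools.product(speeds, headings))
```

In this sketch the discrete agent can only choose among 15 fixed speed and heading combinations, while the continuous agent can output any pair within the bounds. A coarser action grid is one plausible reason for a discrete method to find less fuel-efficient routes, although the sketch itself does not demonstrate this.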