With the development of autonomous ships, fully automated passage planning has become a pressing priority. However, existing A* algorithms tend to generate paths close to land because they prioritize minimizing navigation distance. Therefore, this study proposes a deep Q-network (DQN)-based method that implements reward-and-penalty strategies accounting for required navigation areas and non-navigable areas over the entire port-to-port passage. First, Busan and Gwangyang Ports were selected as the target areas, and a container ship was selected as the target ship. Next, non-navigable and reward areas were designated based on water depth and electronic navigational chart information. Experiments were then conducted in three types of environments: normal conditions, turbulent weather, and environments containing obstacles. Furthermore, the Douglas–Peucker algorithm was employed to eliminate redundant waypoints. Experimental results demonstrated that the ship paths obtained using the deep Q-network supported more efficient and safer navigation decisions, and the navigation distance was reduced by 1.77% compared with the passage plan used by actual ships. The proposed approach automatically derives the optimal mid-range path of a ship and can thus contribute to improving maritime safety and efficiency.
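To illustrate the reward-and-penalty idea described above, the sketch below trains a tabular Q-learning agent on a toy grid, as a simplified stand-in for the paper's deep Q-network. The grid layout, reward values (fairway bonus, land penalty, step cost), and hyperparameters are illustrative assumptions, not the study's actual environment or settings.

```python
import random

# Hypothetical 5x5 chart: 'S' start port, 'G' goal port, '#' non-navigable
# (land / shallow water), 'R' reward area (e.g. designated fairway).
GRID = [
    "S..#.",
    ".R.#.",
    ".R...",
    ".R#..",
    "...RG",
]
ROWS, COLS = len(GRID), len(GRID[0])
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    r, c = state
    nr, nc = r + action[0], c + action[1]
    if not (0 <= nr < ROWS and 0 <= nc < COLS) or GRID[nr][nc] == "#":
        return state, -5.0, False           # penalty: land / out of bounds
    if GRID[nr][nc] == "G":
        return (nr, nc), 10.0, True         # reached destination
    reward = 0.2 if GRID[nr][nc] == "R" else -0.1  # fairway bonus / step cost
    return (nr, nc), reward, False

def train(episodes=2000, alpha=0.5, gamma=0.95, eps=0.2):
    """Epsilon-greedy tabular Q-learning from the start cell."""
    q = {}
    for _ in range(episodes):
        state, done = (0, 0), False
        for _ in range(100):
            if done:
                break
            if random.random() < eps:
                a = random.randrange(4)
            else:
                a = max(range(4), key=lambda i: q.get((state, i), 0.0))
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = max(q.get((nxt, i), 0.0) for i in range(4))
            td = reward + gamma * best_next * (not done) - q.get((state, a), 0.0)
            q[(state, a)] = q.get((state, a), 0.0) + alpha * td
            state = nxt
    return q

def greedy_path(q, max_steps=50):
    """Follow the learned greedy policy to extract a waypoint sequence."""
    state, path = (0, 0), [(0, 0)]
    for _ in range(max_steps):
        a = max(range(4), key=lambda i: q.get((state, i), 0.0))
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path
```

Because the land penalty is large and the fairway bonus is small relative to the goal reward, the learned greedy policy keeps clear of `#` cells while still preferring the reward areas, mirroring the strategy the abstract describes.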
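The Douglas–Peucker simplification mentioned above can be sketched as follows; this is a generic implementation of the standard algorithm on planar coordinates, not the study's code, and the tolerance value is an assumption.

```python
import math

def perpendicular_distance(pt, start, end):
    """Distance from pt to the infinite line through start and end."""
    (x, y), (x1, y1), (x2, y2) = pt, start, end
    dx, dy = x2 - x1, y2 - y1
    seg_len = math.hypot(dx, dy)
    if seg_len == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / seg_len

def douglas_peucker(points, epsilon):
    """Recursively drop waypoints whose deviation from the
    start-end chord is within the tolerance epsilon."""
    if len(points) < 3:
        return list(points)
    # Find the waypoint farthest from the chord.
    dmax, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > dmax:
            dmax, index = d, i
    if dmax <= epsilon:
        return [points[0], points[-1]]  # all intermediates removable
    # Keep the farthest point and simplify both halves.
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right
```

For example, a nearly straight five-waypoint leg collapses to its endpoints, while a genuine course change is preserved: `douglas_peucker([(0, 0), (1, 1), (2, 0)], 0.1)` keeps the corner at `(1, 1)`.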