The need for reliable and flexible wireless networks has increased significantly in recent years, owing to the growing number of devices that rely on these networks to establish communications and access services. Mobile Ad-hoc Networks (MANETs) allow wireless communication without fixed infrastructure by having the nodes forward each other’s packets toward their destinations. Such networks offer greater flexibility but require more complex routing methods. In this study, we propose a new routing method, based on Deep Reinforcement Learning (DRL), that distributes the computations between a Software Defined Network (SDN) controller and the nodes, so that no redundant computations are executed on the nodes, which saves their limited resources. The proposed method significantly increases the lifetime of the network while maintaining a high Packet Delivery Rate (PDR) and throughput. The results also show that the end-to-end delay of the proposed method is slightly larger than that of existing routing methods, because longer alternative routes are needed to balance the load among the nodes of the MANET.