This paper studies the target localization problem with signal transmitters powered by energy harvesting (EH) devices. Because transmission is remote and energy harvesting is random, the energy available to a transmitter is often insufficient to transmit its information, which results in packet dropouts. The packet dropout rate depends mainly on the distance from the target to the transmitter and on the transmission energy. This paper therefore investigates energy allocation policies that ensure the desired positioning performance, using a two-level hierarchical framework. The lower-level policy operates on a specified time scale and aims to minimize the error covariances with a prescribed distance so as to meet the step objective. The higher-level policy plans the step time scale and sets the desired step objectives for the lower-level policy. Both policy levels are trained through deep reinforcement learning. Finally, an example is presented to illustrate the efficiency of the designed strategy.
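The interaction between the two policy levels can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the exponential dropout model, the scalar Kalman covariance recursion with intermittent measurements, the battery dynamics, the numeric constants, and all function names. The two hand-coded policies are only placeholders for the policies that would be trained through deep reinforcement learning.

```python
import math
import random


def dropout_probability(distance, energy, alpha=0.5):
    # Assumed model (hypothetical): dropout grows with distance and
    # shrinks with transmission energy.
    if energy <= 0.0:
        return 1.0
    return 1.0 - math.exp(-alpha * distance / energy)


def higher_level_policy(cov):
    # Placeholder for the trained higher-level policy: pick the length
    # of the next step and the covariance goal for that step.
    horizon = 5
    goal = max(0.5 * cov, 0.2)
    return horizon, goal


def lower_level_policy(cov, goal, battery):
    # Placeholder for the trained lower-level policy: spend a larger
    # share of the battery while the covariance is above the step goal.
    fraction = 0.8 if cov > goal else 0.2
    return fraction * battery


def covariance_update(cov, received, a=1.1, q=0.1, r=0.5):
    # Scalar Kalman covariance recursion with intermittent measurements
    # (assumed estimator; a, q, r are illustrative constants).
    pred = a * a * cov + q
    if not received:
        return pred  # packet dropped: prediction only
    return pred - pred * pred / (pred + r)


def simulate(steps=4, distance=2.0, battery_cap=5.0, seed=0):
    rng = random.Random(seed)
    cov, battery = 1.0, battery_cap
    history = []
    for _ in range(steps):
        horizon, goal = higher_level_policy(cov)
        for _ in range(horizon):
            energy = lower_level_policy(cov, goal, battery)
            received = rng.random() >= dropout_probability(distance, energy)
            cov = covariance_update(cov, received)
            harvested = rng.uniform(0.0, 1.0)  # random energy harvest
            battery = min(battery - energy + harvested, battery_cap)
            history.append((cov, battery, received))
    return history
```

Calling `simulate()` returns one `(covariance, battery, received)` tuple per time slot, which makes it easy to see how a dropped packet lets the covariance grow and how the energy spent in one slot limits what is available in the next.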