Abstract

This paper presents a dual policy iteration algorithm (DPIA) based on reinforcement learning and proves the local convergence, global convergence, and optimality of its “r-iteration” and “k-iteration”. The process by which a remote sensing device captures information is recast in reverse, as the captured information absorbing the electric energy carried by the device, which yields an equivalent electric energy flow model. This model is then transformed into a nonlinear non-affine model. A discrete-time index-cost function is also proposed, based on the quality of the information captured by the various sensors. Finally, a passive remote sensing device, treated as a virtual self-powered device that can detect 16 kinds of information, is simulated, and the results demonstrate the effectiveness of the algorithm.

