Abstract

A sensor computes a state estimate of a closed-loop linear control system. The state estimate is packetized and sent to the controller in the receiver block over a randomly time-varying (fading) packet dropping link. The receiver sends an ACK/NACK packet to the transmitter over a perfect feedback channel. The energy used in packet transmission depletes a battery of limited capacity at the sensor. The battery is replenished by an energy harvester, which has access to a source of everlasting but random harvested energy. Both the energy harvesting process and the fading channel gain process are modelled as finite-state Markov chains. The objective is to design an optimal energy allocation policy at the transmitter and an optimal control policy at the receiver so that an average infinite horizon linear quadratic Gaussian (LQG) control cost is minimised. It is shown that a separation principle holds, the optimal controller is linear, the Kalman filter at the sensor is optimal, and the optimal energy allocation policy at the transmitter can be obtained by solving the Bellman dynamic programming equation of a Markov decision process (MDP) based stochastic control problem. A Q-learning algorithm is used to approximate the optimal energy allocation policy. Numerical simulations illustrate that the dynamic programming based policies outperform simple heuristic policies.
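The abstract states that a Q-learning algorithm approximates the optimal energy allocation policy at the transmitter. The sketch below is a minimal tabular Q-learning loop over a hypothetical finite-state model of the battery level, Markov channel gain, and Markov harvest state; the numerical parameters, packet-success model, and stage-cost proxy are illustrative assumptions and are not taken from the paper.

```python
# Illustrative Q-learning sketch for the transmitter's energy allocation MDP.
# All model parameters (state spaces, transition matrices, cost proxy) are
# hypothetical placeholders, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite-state Markov models for channel gain and harvested energy.
CHANNEL_GAINS = np.array([0.2, 0.5, 0.9])          # base packet success prob. per gain state
P_CHANNEL = np.array([[0.7, 0.2, 0.1],
                      [0.2, 0.6, 0.2],
                      [0.1, 0.2, 0.7]])
HARVEST_LEVELS = np.array([0, 1, 2])               # energy units harvested per slot
P_HARVEST = np.array([[0.6, 0.3, 0.1],
                      [0.3, 0.4, 0.3],
                      [0.1, 0.3, 0.6]])
BATTERY_MAX = 5                                    # battery capacity in energy units
ACTIONS = np.arange(BATTERY_MAX + 1)               # transmit energy per slot

n_states = (BATTERY_MAX + 1) * len(CHANNEL_GAINS) * len(HARVEST_LEVELS)
Q = np.zeros((n_states, len(ACTIONS)))

def encode(b, g, h):
    """Flatten (battery, gain, harvest) into a single state index."""
    return (b * len(CHANNEL_GAINS) + g) * len(HARVEST_LEVELS) + h

def step(b, g, h, e):
    """One slot of the hypothetical model: spend energy e, observe drop/success."""
    e = min(e, b)                                  # cannot spend more than stored energy
    # Higher transmit energy and better gain -> higher packet success probability.
    p_success = 1.0 - (1.0 - CHANNEL_GAINS[g]) ** (e + 1) if e > 0 else 0.0
    dropped = rng.random() > p_success
    cost = 10.0 if dropped else 1.0                # fixed proxy for the LQG stage cost
    b_next = min(b - e + HARVEST_LEVELS[h], BATTERY_MAX)
    g_next = rng.choice(len(CHANNEL_GAINS), p=P_CHANNEL[g])
    h_next = rng.choice(len(HARVEST_LEVELS), p=P_HARVEST[h])
    return cost, b_next, g_next, h_next

alpha, gamma, eps = 0.1, 0.95, 0.1                 # learning rate, discount, exploration
b, g, h = BATTERY_MAX, 0, 0
for t in range(200_000):
    s = encode(b, g, h)
    a = rng.integers(len(ACTIONS)) if rng.random() < eps else int(np.argmin(Q[s]))
    cost, b, g, h = step(b, g, h, ACTIONS[a])
    s_next = encode(b, g, h)
    # Q-learning update (cost minimisation, hence min over next actions).
    Q[s, a] += alpha * (cost + gamma * Q[s_next].min() - Q[s, a])

print("Learned greedy energy allocation per (battery, gain, harvest) state:")
print(np.argmin(Q, axis=1).reshape(BATTERY_MAX + 1, len(CHANNEL_GAINS), len(HARVEST_LEVELS)))
```

The greedy action argmin over the learned Q-table plays the role of the dynamic-programming policy; in the paper the stage cost is tied to the LQG estimation/control error rather than the fixed drop penalty assumed here.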
