Abstract

We consider the sensor scheduling problem for remote state estimation under integrity attacks. We seek to optimize the trade-off between the energy consumed by communication and the state estimation error covariance when the acknowledgment (ACK) information, sent by the remote estimator to the local sensor, is compromised. The sensor scheduling problem is formulated as an infinite-horizon discounted optimal control problem with an infinite state space. We first analyze the underlying Markov decision process (MDP) and show that the optimal schedule in the absence of ACK attacks is of the threshold type, which allows us to simplify the problem by replacing the original state space with a finite one. For the simplified MDP, when the ACK is under attack, the problem is modeled as a partially observable Markov decision process (POMDP). We analyze the MDP induced by this POMDP, which takes a belief vector as its state. We investigate the properties of the exact optimal solution via contractive models and show that a threshold-type solution for the POMDP cannot be readily obtained. A suboptimal solution is then obtained via a rollout approach, a prominent class of reinforcement learning (RL) methods based on approximation in value space. We present two variants of rollout and provide performance bounds for both. Finally, numerical examples demonstrate the effectiveness of the proposed rollout methods by comparing them with the finite-history-window approach widely used in RL for POMDPs.
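To make the rollout idea concrete, the following is a minimal, self-contained sketch of one-step rollout on a toy belief-state MDP with a binary transmit/idle action. The state space size, transition model, costs, and base policy are hypothetical placeholders, and the sketch omits the observation-driven belief filtering that a full POMDP treatment requires; it illustrates the general approximation-in-value-space mechanism rather than the paper's specific algorithm.

```python
# Illustrative one-step rollout on a small belief-state MDP.
# All quantities (state space size, transition model, costs, base policy)
# are hypothetical placeholders, not taken from the paper.
import numpy as np

N_STATES = 4          # finite state space of the simplified MDP (hypothetical size)
ACTIONS = [0, 1]      # 0: stay idle, 1: transmit
GAMMA = 0.9           # discount factor
HORIZON = 30          # truncation horizon when evaluating the base policy

# Hypothetical transition matrices P[a][s, s']: transmitting tends to reset
# the "estimation error" state toward 0 but incurs an energy cost.
P = {
    0: np.array([[0.1, 0.9, 0.0, 0.0],
                 [0.0, 0.1, 0.9, 0.0],
                 [0.0, 0.0, 0.1, 0.9],
                 [0.0, 0.0, 0.0, 1.0]]),
    1: np.array([[0.9, 0.1, 0.0, 0.0],
                 [0.8, 0.2, 0.0, 0.0],
                 [0.7, 0.3, 0.0, 0.0],
                 [0.6, 0.4, 0.0, 0.0]]),
}
ERROR_COST = np.array([0.0, 1.0, 2.0, 4.0])  # grows with the error state
ENERGY_COST = 1.5                            # cost per transmission

def stage_cost(belief, action):
    """Expected per-stage cost under the current belief."""
    return belief @ ERROR_COST + ENERGY_COST * action

def belief_update(belief, action):
    """Predict the next belief (observation filtering is omitted in this sketch)."""
    return belief @ P[action]

def base_policy(belief):
    """Hypothetical base policy: transmit when the expected error state is large."""
    return 1 if belief @ np.arange(N_STATES) >= 1.5 else 0

def base_policy_cost(belief):
    """Truncated discounted cost of following the base policy from `belief`."""
    total, discount = 0.0, 1.0
    for _ in range(HORIZON):
        a = base_policy(belief)
        total += discount * stage_cost(belief, a)
        belief = belief_update(belief, a)
        discount *= GAMMA
    return total

def rollout_action(belief):
    """One-step lookahead using the base policy's cost as the cost-to-go approximation."""
    q_values = [
        stage_cost(belief, a) + GAMMA * base_policy_cost(belief_update(belief, a))
        for a in ACTIONS
    ]
    return int(np.argmin(q_values)), q_values

if __name__ == "__main__":
    belief = np.array([0.25, 0.25, 0.25, 0.25])
    action, q = rollout_action(belief)
    print(f"rollout action: {action}, Q-values: {q}")
```

The rollout policy improves on the base policy by minimizing, at each belief, the one-step cost plus the discounted cost-to-go of the base policy; this is the standard policy-improvement argument behind the performance bounds mentioned above.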
