Abstract

We consider the remote control of multiple Markov decision processes (MDPs) over a slotted multiple access channel (MAC). A subset of the state observations of each process is sent over the MAC to its remote controller, which therefore has only limited and delayed state information. The remote controllers use maximum likelihood state estimates to determine control actions at every slot, and these actions are fed back to the MDPs. We investigate the relevance of information freshness metrics in this context and show that, for a class of MDPs with action-independent transitions, the distribution of the age of information determines the average reward. We then characterize the age-of-information distribution for common MAC protocols, namely TDMA, ALOHA, and CSMA, under a single-cell assumption, which yields the average reward achieved by the remote controller. For ALOHA and CSMA, we observe that the attempt probability that minimizes the average age of information also maximizes the average reward. Finally, we compare the average reward achieved under the different MAC protocols and for MDPs with different mixing times.
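
To make the ALOHA observation concrete, here is a minimal simulation sketch, not code from the paper: it estimates the average age of information (AoI) of one tagged node under single-cell slotted ALOHA while sweeping the attempt probability p. The model (symmetric nodes, success only when exactly one node transmits in a slot, age resetting to 1 on delivery) and all parameter values (N = 10 nodes, the horizon length) are illustrative assumptions.

```python
# Illustrative sketch (not the paper's code): average AoI of a tagged
# node under single-cell slotted ALOHA, for a given attempt probability p.
import numpy as np

def average_aoi_aloha(num_nodes: int, p: float, horizon: int = 200_000) -> float:
    """Average AoI of a tagged node: the age grows by one each slot and
    resets to 1 after a slot in which exactly one node transmits, i.e. a
    collision-free delivery of a fresh state observation."""
    rng = np.random.default_rng(0)
    age, total = 1, 0
    for _ in range(horizon):
        transmitters = rng.random(num_nodes) < p
        if transmitters[0] and transmitters.sum() == 1:
            age = 1      # fresh observation reaches the controller
        else:
            age += 1     # idle slot or collision: information grows stale
        total += age
    return total / horizon

if __name__ == "__main__":
    N = 10  # assumed number of nodes
    for p in (0.02, 0.05, 1 / N, 0.2, 0.4):
        print(f"p = {p:.2f}: average AoI ~ {average_aoi_aloha(N, p):.1f}")
```

In this symmetric model the tagged node's per-slot delivery probability is p(1-p)^(N-1), so the simulated average AoI bottoms out near p = 1/N, the attempt probability that also maximizes slotted ALOHA throughput; under the paper's result for action-independent MDPs, that same choice would also maximize the controller's average reward.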
