Abstract

We consider remote control of multiple Markov decision processes (MDPs) over a slotted multiple access channel (MAC). A subset of the state observations of each process is sent over the MAC to its remote controller, which therefore has limited and delayed state information. The remote controllers use maximum likelihood state estimates to determine control actions at every slot, and these actions are fed back to the MDPs. We investigate the relevance of information freshness metrics in this setting and show that, for a class of MDPs with action-independent transitions, the distribution of the age of information determines the average reward. We then characterize the age-of-information distribution for common MAC protocols such as TDMA, ALOHA, and CSMA under a single-cell assumption, which in turn yields a characterization of the remote controller's average reward. For ALOHA and CSMA, the attempt probability that minimizes the average age of information also maximizes the average reward. Finally, we compare the average reward achieved by the different MAC protocols and by MDPs with different mixing times.
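To make the link between the attempt probability and the average age of information concrete, the following is a minimal simulation sketch, not the paper's model: it assumes a slotted ALOHA collision channel in a single cell (a slot succeeds only if exactly one node transmits) and generate-at-will sampling, so a successful delivery resets that node's age. The function name `simulate_aloha_aoi` and all parameter values are illustrative.

```python
import numpy as np

def simulate_aloha_aoi(n_nodes, p, n_slots, seed=0):
    """Time-average per-node age of information under slotted ALOHA.

    Assumes a collision channel (a slot is successful only if exactly
    one node transmits) and generate-at-will sampling, so a successful
    delivery resets that node's age to one slot.
    """
    rng = np.random.default_rng(seed)
    age = np.ones(n_nodes)           # current AoI at the controller
    total_age = 0.0
    for _ in range(n_slots):
        attempts = rng.random(n_nodes) < p
        if attempts.sum() == 1:      # success: exactly one transmitter
            age[np.argmax(attempts)] = 0.0
        age += 1.0                   # age grows by one slot for everyone
        total_age += age.mean()
    return total_age / n_slots

# Sweep the attempt probability for a fixed number of nodes; the
# average AoI is minimized near p = 1/N, the classical
# throughput-optimal operating point of slotted ALOHA.
n = 10
for p in (0.02, 0.05, 0.1, 0.2, 0.4):
    print(f"p = {p:.2f}  avg AoI ~ {simulate_aloha_aoi(n, p, 200_000):.1f}")
```

Under the abstract's result for action-independent MDPs, an attempt probability that shifts the AoI distribution toward smaller ages in a sweep like this would correspondingly improve the controller's average reward.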
