Abstract
In Device-to-Device (D2D) enabled cellular networks, user-to-network relaying can be used to improve network performance. When the relays are mobile, a dynamic relay selection strategy is required. In this paper, we propose a dynamic relay selection policy that maximizes the performance of cellular networks (e.g., throughput, reliability, coverage) under cost constraints (e.g., transmission power, power budget). We model the relays' dynamics as a Markov Decision Process (MDP). Since only the locations of the selected relays are observed, the sequential relay selection process is formulated as a Constrained Partially Observable Markov Decision Process (CPOMDP). Because the exact solution of such a framework is intractable, we prove the submodularity of the reward and cost functions and derive a greedy point-based value iteration solution. Taking throughput as the reward metric and energy as the cost metric, numerical results validate the proposed relay selection policy and show that introducing D2D relaying can substantially improve the performance of cellular networks.
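To give a rough feel for why submodularity makes a greedy approach attractive, the following is a minimal sketch of greedy selection by marginal reward per unit cost under a budget. It is not the paper's greedy point-based value iteration, which operates over belief states of the CPOMDP; it only illustrates the static selection idea. The function `greedy_select`, the toy coverage reward, the per-relay energy costs, and the budget value are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, FrozenSet, Iterable, Set


def greedy_select(
    candidates: Iterable[int],
    reward: Callable[[FrozenSet[int]], float],
    cost: Callable[[int], float],
    budget: float,
) -> Set[int]:
    """Greedily add the relay with the best marginal reward per unit cost.

    Assumes `reward` is a set function (ideally monotone submodular) and
    that every relay has a strictly positive cost.
    """
    chosen: Set[int] = set()
    remaining = set(candidates)
    spent = 0.0
    while remaining:
        base = reward(frozenset(chosen))
        best, best_ratio = None, 0.0
        for relay in remaining:
            relay_cost = cost(relay)
            if spent + relay_cost > budget:
                continue  # this relay would exceed the cost budget
            gain = reward(frozenset(chosen | {relay})) - base
            ratio = gain / relay_cost
            if ratio > best_ratio:
                best, best_ratio = relay, ratio
        if best is None:
            break  # no affordable relay improves the reward
        chosen.add(best)
        spent += cost(best)
        remaining.discard(best)
    return chosen


# Hypothetical toy instance: each relay covers a set of users, and the
# reward is the number of distinct users covered (a submodular function).
coverage = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 2, 3, 4}}
costs = {0: 2.0, 1: 1.0, 2: 1.5, 3: 4.0}  # assumed energy cost per relay


def covered_users(relays: FrozenSet[int]) -> float:
    users = set()
    for relay in relays:
        users |= coverage[relay]
    return float(len(users))


selected = greedy_select(coverage.keys(), covered_users, costs.__getitem__, budget=3.0)
print(sorted(selected))
```

In this toy instance the reward stands in for throughput and the cost for energy; a real evaluation would replace both with the metrics defined in the paper.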