Abstract

In this paper, we consider queue-aware distributive power and relay selection control for two-hop cooperative OFDM systems with bursty arrivals. The complex interactions of the queues at the source node and the M relay stations (RSs) are modeled as an infinite-horizon average-reward Markov Decision Process (MDP), whose state space involves the joint queue state information (QSI) of the queues at the source node and the M RSs as well as the joint channel state information (CSI) of all source-relay (S-R) and relay-destination (R-D) links. The traditional approach to solving this MDP requires centralized control with huge complexity. To address the curse of dimensionality, we first propose an equivalent MDP formulation on a reduced state space. We show that the delay-optimal power control and link selection algorithm, which is a function of both the CSI and the QSI, has a multi-level water-filling structure. To obtain a distributive, low-complexity solution, we introduce a linear structure that approximates the value function by the sum of per-node potential functions. Furthermore, we derive a distributive stochastic online learning algorithm in which each node recursively estimates its per-node potential function based on real-time observations of the local CSI and local QSI only. Finally, we show that the combined distributive learning converges almost surely to the global optimal solution for large arrival rates.
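The abstract compresses two algorithmic ideas: a QSI-driven multi-level water-filling power allocation and a per-node stochastic learning recursion for the potential functions. As a rough illustration only, the Python sketch below shows one plausible shape of these two pieces for a single node; the buffer size, step sizes, stage cost, service model, and the mapping from potential differences to the water level are all assumptions for exposition, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def waterfilling_power(water_level, channel_gain, noise=1.0):
    """Multi-level water-filling: power is the positive part of
    (water level - noise/gain); here the level varies with the local QSI."""
    return max(water_level - noise / channel_gain, 0.0)

class NodePotentialLearner:
    """Per-node online estimate of the potential function, indexed by the
    local queue state only (the linear value-function approximation)."""
    def __init__(self, buffer_size):
        self.V = np.zeros(buffer_size + 1)        # potential per local QSI value
        self.avg_cost = 0.0                       # running average-cost estimate
        self.visits = np.zeros(buffer_size + 1, dtype=int)

    def update(self, q, q_next, stage_cost):
        # Stochastic-approximation (temporal-difference-style) update using
        # only locally observed queue transitions and costs.
        self.visits[q] += 1
        eps = 1.0 / self.visits[q]                # assumed per-state step size
        td = stage_cost - self.avg_cost + self.V[q_next] - self.V[q]
        self.V[q] += eps * td
        self.avg_cost += 0.01 * td                # assumed slower timescale

# Toy single-node run: Bernoulli arrivals, exponentially distributed gains.
B = 10
learner = NodePotentialLearner(B)
q = 0
for t in range(10_000):
    h = rng.exponential(1.0)                      # local CSI observation
    level = max(learner.V[q] - learner.V[max(q - 1, 0)], 0.0)
    p = waterfilling_power(level, h)              # QSI-dependent water level
    served = 1 if p * h > 1.0 and q > 0 else 0    # crude one-packet service model
    arrival = rng.random() < 0.3
    q_next = min(max(q - served, 0) + arrival, B)
    learner.update(q, q_next, stage_cost=q)       # delay cost ~ queue length
    q = q_next
```

In the paper's distributive scheme, the source node and each of the M RSs would run such a recursion in parallel on their own local CSI and QSI, with the almost-sure convergence result standing in for this ad hoc simulation.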
