Abstract

In this paper, a low-complexity delay-aware cross-layer scheduling algorithm for two-hop relay communication systems is proposed. The complex interactions among the queues at the source node and the M relay stations (RSs) are modeled as an infinite-horizon average-reward Markov decision process (MDP), whose state space comprises the joint queue state information (QSI) of the queues at the source node and the M RSs as well as the joint channel state information (CSI) of all source-relay (S-R) and relay-destination (R-D) links. To address the curse of dimensionality, an equivalent MDP formulation is first proposed in which the system state depends only on the global QSI. Furthermore, using approximate MDP and stochastic learning, an auction-based distributed online learning algorithm is derived, in which each node iteratively estimates a per-node value function based on real-time observations of its local CSI and local QSI as well as signaling exchanged between relays. The combined distributed learning algorithm converges almost surely to a globally optimal solution for large arrivals. Finally, simulations show that the proposed scheme achieves significant gains over various baselines, such as conventional CSIT-only control and throughput-optimal control (in the stability sense).
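To make the structure of the distributed solution concrete, the following is a minimal Python sketch of one possible realization of the ideas named in the abstract: each relay maintains a per-node value function over its local QSI, submits an auction bid computed from its local CSI and the learned marginal queueing cost, and refines its value estimate via a stochastic-approximation update with decreasing step size. The class names, cost model, and channel/arrival dynamics below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Hypothetical sketch of auction-based distributed online learning.
# Each relay keeps a per-node value function over its local QSI, bids
# using only local information, and updates the value function by
# stochastic approximation. All dynamics here are illustrative only.

MAX_Q = 10  # assumed finite buffer size per relay

class RelayNode:
    def __init__(self):
        self.V = np.zeros(MAX_Q + 1)  # per-node value function over local QSI
        self.q = 0                    # local queue length (local QSI)

    def bid(self, csi):
        # Bid trades the instantaneous rate (local CSI) against the
        # marginal queueing cost implied by the learned value function.
        marginal_cost = self.V[min(self.q + 1, MAX_Q)] - self.V[self.q]
        return csi - marginal_cost

    def learn(self, reward, q_next, step):
        # Stochastic-approximation update toward the observed per-stage
        # reward plus the estimated value of the next local queue state.
        target = reward + self.V[q_next]
        self.V[self.q] += step * (target - self.V[self.q])
        self.q = q_next

rng = np.random.default_rng(0)
relays = [RelayNode() for _ in range(4)]  # M = 4 relays, for illustration

for t in range(1, 5001):
    csi = rng.rayleigh(1.0, size=len(relays))  # i.i.d. per-link CSI draws
    # Auction: each relay bids from local CSI/QSI only; the winner is
    # scheduled, so no node needs the global system state.
    winner = max(range(len(relays)), key=lambda m: relays[m].bid(csi[m]))
    for m, node in enumerate(relays):
        served = 1 if m == winner else 0
        arrival = int(rng.integers(0, 2))  # Bernoulli packet arrivals
        q_next = int(np.clip(node.q + arrival - served, 0, MAX_Q))
        reward = -node.q                   # delay-aware cost: queue backlog
        node.learn(reward, q_next, step=1.0 / t)
```

The decreasing step size 1/t is the standard stochastic-approximation choice that underlies almost-sure convergence arguments of the kind invoked in the abstract; the auction step is what removes the need for any node to observe the global CSI and QSI.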
