Abstract

In this paper, we study the problem of designing adaptive Medium Access Control (MAC) solutions for wireless sensor networks (WSNs) under the Irregular Repetition Slotted ALOHA (IRSA) protocol. In particular, we optimize the degree distribution employed by IRSA for finite frame sizes. Motivated by characteristics of WSNs, such as restricted computational resources and partial observability, we model the design of IRSA as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP). We theoretically analyze our solution in terms of the optimality of the learned IRSA design and derive guarantees for finding near-optimal policies. These guarantees are generic and can be applied to resource allocation problems that exhibit the waterfall effect, which in our setting manifests itself as a severe degradation in the overall throughput of the network above a particular channel load. Furthermore, we combat the inherent non-stationarity of the learning environment in WSNs by augmenting classical Q-learning with virtual experience (VE), a technique that enables the update of multiple state-action pairs per learning iteration and thus accelerates convergence. Our simulations confirm the superiority of our learning-based MAC solution over traditional IRSA and provide insights into the effect of WSN characteristics on the quality of the learned policies.
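
To make the IRSA mechanics behind the waterfall effect concrete, below is a minimal Python simulation sketch (our own illustration, not the paper's code): each user transmits replicas of its packet in randomly chosen slots according to a degree distribution, and the receiver decodes via successive interference cancellation. The degree distribution used here is illustrative, not the optimized finite-frame distribution derived in the paper; sweeping the channel load G exposes the throughput collapse described above.

```python
import random

def simulate_irsa_frame(n_users, n_slots, degree_dist, rng=random):
    """Simulate one IRSA frame with successive interference cancellation (SIC).

    degree_dist: dict mapping repetition degree d -> probability Lambda_d.
    Returns the number of users decoded in this frame.
    """
    degrees = list(degree_dist.keys())
    probs = list(degree_dist.values())
    # Each user draws a degree d and places d replicas in distinct slots.
    slots = [set() for _ in range(n_slots)]
    for user in range(n_users):
        d = rng.choices(degrees, weights=probs, k=1)[0]
        for s in rng.sample(range(n_slots), d):
            slots[s].add(user)
    decoded = set()
    progress = True
    while progress:
        progress = False
        for s in range(n_slots):
            if len(slots[s]) == 1:            # singleton slot: decode its user
                (user,) = slots[s]
                decoded.add(user)
                for t in range(n_slots):       # cancel all replicas of that user
                    slots[t].discard(user)
                progress = True
    return len(decoded)

# Sweep the channel load G = n_users / n_slots: throughput tracks G up to a
# threshold, then collapses -- the waterfall effect mentioned in the abstract.
if __name__ == "__main__":
    M = 200
    dist = {2: 0.5, 3: 0.28, 8: 0.22}   # illustrative Lambda(x), for demonstration only
    for G in [0.5, 0.7, 0.8, 0.9, 1.0]:
        N = int(G * M)
        T = sum(simulate_irsa_frame(N, M, dist) for _ in range(50)) / (50 * M)
        print(f"G={G:.2f}  throughput={T:.3f}")
```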

Highlights

  • Wireless sensor networks (WSNs) have drawn the attention of the research community due to their wide applicability and the challenges inherent in their optimization

  • The large number of sensors and partial observability may lead to an explosion in the complexity of learning, which we remedy by employing two techniques: (i) adopting finite histories of observations to approximate the continuous beliefs of Belief MDPs, which significantly reduces the size of the state space and, as we prove in Section VII-A, can still lead to policies with near-optimal performance (see the sketch after this list), and (ii) assuming that each sensor learns independently from other sensors, by updating its local Q-function based on its individual observations and actions

  • We examine the performance of the proposed scheme in simulations of various settings and compare the derived distributions with those used by classical Irregular Repetition Slotted ALOHA (IRSA).
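
As a rough illustration of technique (i) above, the following Python sketch (our own, not the paper's implementation) keys a Q-table by a fixed-length window of recent local observations instead of a continuous belief. The observation alphabet (e.g., ACK/NACK feedback on a sensor's transmissions) is an assumption made for illustration.

```python
from collections import deque

class FiniteHistoryState:
    """Approximate a belief state by the H most recent local observations.

    Instead of tracking a continuous belief over the hidden network state,
    each sensor keys its Q-table by a fixed-length observation history,
    keeping the state space finite (|O|^H states for history length H).
    """
    def __init__(self, history_len):
        self.history = deque(maxlen=history_len)

    def update(self, observation):
        # observation could be, e.g., ACK/NACK feedback for the last frame
        self.history.append(observation)

    def key(self):
        # hashable key usable as a Q-table index
        return tuple(self.history)

# usage: state = FiniteHistoryState(3); state.update("ACK"); q[state.key(), action] ...
```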


Summary

INTRODUCTION

Wireless sensor networks (WSNs) have drawn the attention of the research community due to their wide applicability and the challenges inherent in their optimization. Ensuring realistic complexity is essential when designing solutions for WSNs. The large number of sensors and partial observability may lead to an explosion in the complexity of learning, which we remedy by employing two techniques: (i) adopting finite histories of observations to approximate the continuous beliefs of Belief MDPs, which significantly reduces the size of the state space and, as we prove in Section VII-A, can still lead to policies with near-optimal performance, and (ii) assuming that each sensor learns independently from other sensors, by updating its local Q-function based on its individual observations and actions. Q-learning will exhibit sub-optimal performance if the environment changes at a rate quicker than its convergence rate. To address this issue, our solution equips Q-learning with the concept of virtual experience (VE) [8], where an agent updates multiple state-action pairs at each Q-learning iteration by “imagining” state visits.
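
Below is a minimal tabular sketch of Q-learning with virtual experience, in the spirit of [8] but not the paper's exact algorithm: besides the real transition, the agent also updates "imagined" state-action pairs supplied by a problem-specific equivalent_states function (a hypothetical helper here), which is what accelerates convergence in non-stationary settings.

```python
import random
from collections import defaultdict

class VirtualExperienceQLearner:
    """Tabular Q-learning where each real transition also updates 'virtual'
    state-action pairs known to share the same dynamics.

    equivalent_states(s, a, r, s_next) must yield (s_v, a_v, r_v, s_v_next)
    tuples the agent can 'imagine' visiting; deriving them is
    problem-specific (see [8]).
    """
    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)          # (state, action) -> value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # epsilon-greedy action selection over the local Q-function
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def _update_one(self, s, a, r, s_next):
        best_next = max(self.q[(s_next, b)] for b in self.actions)
        target = r + self.gamma * best_next
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])

    def learn(self, s, a, r, s_next, equivalent_states):
        self._update_one(s, a, r, s_next)                  # real experience
        for s_v, a_v, r_v, s_v_next in equivalent_states(s, a, r, s_next):
            self._update_one(s_v, a_v, r_v, s_v_next)      # virtual experience
```

Pairing this learner with the finite-history state above yields one decentralized agent per sensor, each updating only its own Q-table from local observations, as described in technique (ii).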

RELATED WORK
Q-LEARNING IN WSNS
Physical layer
Buffer and traffic model
PROBLEM FORMULATION
Optimization objective
Modeling as MDP
Dec-POMDP Formulation
Dealing with partial observability
Learning in a Dec-POMDP framework
Virtual experience
Optimality analysis
Rate of convergence analysis
Computational complexity
SIMULATIONS
Protocol Comparison
Effect of state space size
Waterfall effect
Findings
CONCLUSIONS