Policy Compression for Intelligent Continuous Control on Low-Power Edge Devices.

Thomas Avé, Kevin Mets, Tom De Schepper

doi:10.3390/s24154876

Abstract

Interest in deploying deep reinforcement learning (DRL) models on low-power edge devices, such as Autonomous Mobile Robots (AMRs) and Internet of Things (IoT) devices, has risen significantly. Running inference on the device enables real-time operation by eliminating the latency and reliability issues of wireless communication, and processing data locally brings privacy benefits. Deploying such energy-intensive models on power-constrained devices is not always feasible, however, which has led to the development of model compression techniques that reduce the size and computational complexity of DRL policies. Policy distillation, the most popular of these methods, lowers the number of network parameters by transferring the behavior of a large teacher network to a smaller student model, which is then deployed at the edge. This works well with deterministic policies that operate using discrete actions. However, many power-constrained real-world tasks, such as those in robotics, are formulated using continuous action spaces, which existing policy distillation methods do not support. In this work, we improve the policy distillation method to support the compression of DRL models designed to solve these continuous control tasks, with an emphasis on maintaining the stochastic nature of continuous DRL algorithms. Experiments show that our methods can effectively compress such policies by up to 750% while maintaining or even exceeding their teacher's performance by up to 41% on two popular continuous control tasks.
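To make the idea of distilling a stochastic continuous policy concrete, the following is a minimal sketch, not the authors' implementation. It assumes that both teacher and student output a diagonal Gaussian over actions (a mean and a standard deviation per action dimension) and that the student is trained to minimize the KL divergence from the teacher's distribution. The abstract does not state which loss the paper uses, so the loss choice and the names `gaussian_kl` and `distillation_loss` are illustrative assumptions.

```python
import math


def gaussian_kl(mu_t, std_t, mu_s, std_s):
    """KL(teacher || student) between two diagonal Gaussian action distributions.

    Each argument is a sequence with one entry per action dimension.
    Assumption: both policies parameterize actions as independent Gaussians.
    """
    total = 0.0
    for mt, st, ms, ss in zip(mu_t, std_t, mu_s, std_s):
        # Closed-form KL divergence for one-dimensional Gaussians.
        total += math.log(ss / st) + (st ** 2 + (mt - ms) ** 2) / (2.0 * ss ** 2) - 0.5
    return total


def distillation_loss(teacher_outputs, student_outputs):
    """Mean KL over a batch of (mean, std) pairs, one pair per observed state."""
    kls = [
        gaussian_kl(mu_t, std_t, mu_s, std_s)
        for (mu_t, std_t), (mu_s, std_s) in zip(teacher_outputs, student_outputs)
    ]
    return sum(kls) / len(kls)


# Hypothetical batch of two states with a two-dimensional action space.
teacher = [([0.0, 1.0], [1.0, 0.5]), ([0.2, -0.3], [0.8, 0.8])]
student = [([0.1, 0.9], [1.0, 0.5]), ([0.2, -0.3], [0.8, 0.8])]
loss = distillation_loss(teacher, student)
```

Matching the full distribution, rather than only the teacher's mean action, is one way a student could retain the stochastic behavior that the abstract emphasizes; matching only the mean would yield a deterministic student.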

Journal: Sensors (Basel, Switzerland)
Publication Date: Jul 27, 2024
License type: CC BY

Similar Papers

Deep Reinforcement Learning for Automatic Drilling Optimization Using an Integrated Reward Function
Xu Huang ... Ted Furlong
27 Feb 2024

Distributed Reinforcement Learning with States Feature Encoding and States Stacking in Continuous Action Space
Tianqi Xu ... Zhiyuan Wang
01 Jan 2020

Action decoupled SAC reinforcement learning with discrete-continuous hybrid action spaces
Yahao Xu ... Hongbin Deng
Neurocomputing | VOL. 537
31 Mar 2023

Human control of an inverted pendulum: Is continuous control necessary? Is intermittent control effective? Is intermittent control physiological?
Ian D Loram ... Martin Lakie
The Journal of Physiology | VOL. 589
14 Jan 2011
