Actor-critic Reinforcement Learning Research Articles

Efficient energy management of electric vehicles (EVs) equipped with hybrid energy storage systems (HESS) is challenging because of the vast search space, numerous control variables, and intricate driving conditions. In response, this paper introduces a deep reinforcement learning (DRL) approach based on the soft actor-critic (SAC) algorithm, tailored to optimizing energy distribution within EVs equipped with a battery-supercapacitor (SC) HESS. The proposed SAC-based energy management system (EMS) is designed to address limitations of most existing DRL algorithms, such as slow convergence, discretization errors, unstable training dynamics, and suboptimal results. The SAC-based EMS is trained in a continuous action space through self-play, using a complex driving cycle and a novel reward function. The algorithm demonstrates its efficiency by rapidly maximizing cumulative rewards and refining its decision-making policy. Extensive experiments then show the superiority of the proposed SAC-based EMS over rule-based (RB) techniques, deep deterministic policy gradient (DDPG), and battery-only configurations across various driving cycles. Notably, the approach allocates high-power surges and braking energy to the SC, reducing the frequency of charge/discharge cycles and thereby extending the battery's lifespan. Additionally, the learned SAC-based EMS policy reduces electricity consumption by 39.6%, 47.87%, and 45% compared to the battery-only configuration, and by 42.56%, 47.87%, and 45.83% compared to the RB technique, under the NYCC, LA92, and FTP cycles, respectively. In contrast, the DDPG algorithm tends to rely heavily on the SC to deliver the total power required, even under normal driving conditions.
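
As a rough illustration of what a continuous action space means for a battery-SC power split, the sketch below maps a single action in [-1, 1] to a share of the demanded power sent to the SC. The function name `split_power`, the action encoding, and the symmetric SC power limit are assumptions made for this example only; the abstract does not describe the paper's actual action formulation.

```python
def split_power(demand_kw, action, sc_limit_kw):
    """Split demanded power between battery and supercapacitor (SC).

    Hypothetical encoding (not taken from the paper):
    action = -1 sends nothing to the SC, action = +1 sends everything,
    subject to a symmetric SC power limit. Negative demand is braking.
    """
    clipped = min(1.0, max(-1.0, action))   # keep the action inside [-1, 1]
    sc_share = (clipped + 1.0) / 2.0        # rescale to a fraction in [0, 1]
    sc_kw = sc_share * demand_kw
    sc_kw = max(-sc_limit_kw, min(sc_limit_kw, sc_kw))  # respect the SC limit
    battery_kw = demand_kw - sc_kw          # battery covers the remainder
    return battery_kw, sc_kw


if __name__ == "__main__":
    print(split_power(40.0, 0.0, 30.0))    # even split of a moderate demand
    print(split_power(100.0, 1.0, 30.0))   # surge: SC saturates at its limit
    print(split_power(-50.0, 1.0, 30.0))   # braking: SC absorbs up to its limit
```

Because the action is a real number rather than one of a few fixed levels, the policy can choose any split, which is the property the abstract contrasts with the discretization errors of discrete-action methods.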

Resource allocation in Narrowband Internet of Things (NB-IoT) networks is a complex challenge because of dynamic user demands, variable channel conditions, and distance considerations. Traditional approaches often struggle to adapt to these dynamic environments. This study applies reinforcement learning (RL), specifically the Soft Actor–Critic (SAC) algorithm, to NB-IoT resource allocation and compares its performance against conventional RL algorithms, including Deep Q-Network (DQN) and Proximal Policy Optimization (PPO). SAC is used to train an agent for adaptive resource allocation, focusing on energy efficiency, throughput, latency, fairness, interference constraints, recovery time, and long-term performance stability. The agent balances these objectives through an intricate reward structure and penalty mechanisms. To demonstrate the scalability and effectiveness of SAC, we conducted experiments on NB-IoT networks with varying deployment types and configurations, including standard urban and suburban, high-density urban, industrial IoT, rural and low-density, and IoT service providers. To assess generalization capability, we tested SAC across applications such as smart metering, smart cities, smart agriculture, and asset tracking and management. Our analysis shows that SAC outperforms DQN and PPO across multiple performance metrics. Specifically, SAC improves energy efficiency by 5.60% over PPO and 10.25% over DQN. In terms of latency, SAC achieves a marginal reduction of approximately 0.0124% compared to PPO and 0.0126% compared to DQN. SAC enhances throughput by 214.98% over PPO and 15.72% over DQN. SAC also shows a substantial increase in fairness (Jain’s index), improving by 358.31% over PPO and 614.46% over DQN, and a better recovery time, improving by 18.99% over PPO and 25.07% over DQN. In both deployment scenarios and diverse IoT applications, SAC consistently achieves high total rewards, minimal fluctuations, and stable performance. Energy efficiency remains constant at 7.2 bits per Joule, and latency is approximately 0.080 s. Throughput is robust across different deployments, while fairness remains high, ensuring equitable resource allocation. Recovery times are stable, enhancing operational reliability. These results underscore SAC’s efficiency and robustness in optimizing resource allocation in NB-IoT networks, presenting a promising solution to the complexities of dynamic environments.
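
For readers unfamiliar with the fairness metric named above, Jain's index for allocations x_1..x_n is (sum of x_i)^2 / (n * sum of x_i^2); it equals 1 when every user receives the same amount and 1/n when a single user receives everything. The following is a minimal sketch of that standard formula. The function name and the handling of an all-zero allocation are choices made for this example, not details from the study.

```python
def jain_index(allocations):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), ranging over [1/n, 1]."""
    values = list(allocations)
    if not values:
        raise ValueError("need at least one allocation")
    total = sum(values)
    sum_of_squares = sum(v * v for v in values)
    if sum_of_squares == 0:
        # All-zero allocation: treated as perfectly equal (a convention assumed here).
        return 1.0
    return (total * total) / (len(values) * sum_of_squares)


if __name__ == "__main__":
    print(jain_index([1, 1, 1, 1]))  # equal shares
    print(jain_index([1, 0, 0, 0]))  # one user takes everything
```

Because the index is bounded below by 1/n, large relative improvements such as those reported above are possible when a baseline concentrates resources on few users.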

Articles published on Actor-critic Reinforcement Learning

Adaptive MPC path-tracking controller based on reinforcement learning and preview-based PID controller

Antagonistic Feedback Control of Muscle Length Changes for Efficient Involuntary Posture Stabilization.

Vision-and-language navigation based on history-aware cross-modal feature fusion in indoor environment

A Robust Mean-Field Actor-Critic Reinforcement Learning Against Adversarial Perturbations on Agent States.

Adaptive fault-tolerant control for spacecraft: A dynamic Stackelberg game approach with advantage actor-critic reinforcement learning

Bayesian quadrature policy optimization for spacecraft proximity maneuvers and docking

A soft actor-critic reinforcement learning framework for optimal energy management in electric vehicles with hybrid storage

A soft actor–critic reinforcement learning approach for over the air active beamforming with reconfigurable intelligent surface

Time-Constrained Actor-Critic Reinforcement Learning for Concurrent Order Dispatch in On-Demand Delivery

Active Queue Management in L4S with Asynchronous Advantage Actor-Critic: A FreeBSD Networking Stack Perspective

Next-gen resource optimization in NB-IoT networks: Harnessing soft actor–critic reinforcement learning

Adaptive Optimal Tracking Control of an Underactuated Surface Vessel Using Actor-Critic Reinforcement Learning.

An efficient and lightweight off-policy actor–critic reinforcement learning framework

Robust-optimization-guiding deep reinforcement learning for chemical material production scheduling

Actor-Critic Reinforcement Learning Algorithms for Mean Field Games in Continuous Time, State and Action Spaces

Expert knowledge data-driven based actor–critic reinforcement learning framework to solve computationally expensive unit commitment problems with uncertain wind energy

Controlling estimation error in reinforcement learning via Reinforced Operation

Energy management strategies based on soft actor critic reinforcement learning with a proper reward function design based on battery state of charge constraints

Reinforcement learning in building controls: A comparative study of algorithms considering model availability and policy representation

Development of an Attention Mechanism for Task-Adaptive Heterogeneous Robot Teaming
