Owing to the advantages of a centralized critic for estimating the Q-function and decentralized actors for optimizing the agents' policies, counterfactual multi-agent policy gradients (COMA) stands out among multi-agent reinforcement learning (MARL) algorithms. Sharing policy parameters can improve sampling efficiency and learning effectiveness, but it may lead to a lack of policy diversity. Hence, balancing parameter sharing and policy diversity among agents in COMA has been a persistent research topic. In this paper, an effective training method for the COMA policy network based on a differential evolution (DE) algorithm is proposed, named DE-COMA. DE-COMA treats the individuals in a population as computational units that construct the policy network through mutation, crossover, and selection operations. The average return of DE-COMA serves as the fitness function, and the best individual in the policy-network population is selected for the next generation. By preserving the benefits of parameter sharing while enhancing parameter diversity, DE-COMA makes multi-agent strategies more exploratory. To validate the effectiveness of DE-COMA, experiments were conducted in the StarCraft II environment on the 2s_vs_1sc, 2s3z, 3m, and 8m battle scenarios. Experimental results demonstrate that DE-COMA significantly outperforms the traditional COMA and most other multi-agent reinforcement learning algorithms in terms of win rate and convergence speed.
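The mutation, crossover, and selection operations mentioned above can be sketched as a standard DE/rand/1/bin loop. This is a minimal illustration only, not the paper's implementation: in DE-COMA each individual would encode the flattened policy-network parameters and the fitness would be the average episode return, whereas here a toy quadratic objective stands in for the fitness function, and all hyperparameter values (`pop_size`, `F`, `CR`) are assumptions.

```python
import numpy as np

def differential_evolution(fitness, dim, pop_size=10, F=0.5, CR=0.9,
                           generations=200, seed=0):
    """Evolve a population of parameter vectors (e.g. flattened policy
    weights) with DE/rand/1/bin; higher fitness is better."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, dim))           # initial parameter vectors
    fit = np.array([fitness(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: base vector plus scaled difference of two others,
            # drawn from three distinct individuals different from i.
            r1, r2, r3 = rng.choice([j for j in range(pop_size) if j != i],
                                    size=3, replace=False)
            mutant = pop[r1] + F * (pop[r2] - pop[r3])
            # Binomial crossover: mix mutant and target per dimension,
            # forcing at least one gene to come from the mutant.
            mask = rng.random(dim) < CR
            mask[rng.integers(dim)] = True
            trial = np.where(mask, mutant, pop[i])
            # Selection: keep the trial only if it does not worsen fitness.
            f_trial = fitness(trial)
            if f_trial >= fit[i]:
                pop[i], fit[i] = trial, f_trial
    best = int(np.argmax(fit))
    return pop[best], fit[best]
```

For example, maximizing the stand-in objective `-||x - 1||^2` drives the best individual toward the all-ones vector, mirroring how the best policy parameters are carried into the next generation.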