State Representation Learning Research Articles

Although object detection has made significant progress over the past decade, small object detection remains far from satisfactory because of the high variability of object scales and complex backgrounds. A common way to improve small object detection is to use high-resolution (HR) images. However, this approach requires substantial computation, which grows quadratically with image resolution. To achieve both accuracy and efficiency, we propose a novel reinforcement learning framework that employs an efficient policy network consisting of a Spatial Transformation Network to enhance state representation learning and a Transformer model with early convolution to improve feature extraction. Our method has two main steps: (1) coarse location query (CLQ), where an RL agent is trained to predict the locations of small objects on low-resolution (LR) images (down-sampled versions of the HR images); (2) context-sensitive object detection, where HR image patches are used to detect objects at the selected coarse locations and LR image patches are used on background areas (containing no small objects). In this way, we obtain high detection performance on small objects while avoiding unnecessary computation on background areas. The proposed method has been tested and benchmarked on various datasets. On the Caltech Pedestrians Detection and Web Pedestrians datasets, the proposed method improves detection accuracy by 2% while reducing the number of processed pixels. On the Vision meets Drone object detection dataset and the Oil and Gas Storage Tank dataset, the proposed method outperforms state-of-the-art (SotA) methods. On the MS COCO mini-val set, our method outperforms SotA methods on small object detection while achieving comparable performance on medium and large objects.
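The pixel savings behind the two-step scheme described above can be illustrated with a small counting sketch. It is not the authors' implementation: the function name `processed_pixels`, the 4x4 patch grid, the down-sampling factor of 4, and the hand-written selection mask (standing in for the output of the coarse location query) are all assumptions made for illustration.

```python
import numpy as np


def processed_pixels(hr_size, grid, selected, scale):
    """Count pixels processed when selected patches use HR and the rest use LR.

    hr_size:  (height, width) of the high-resolution image
    grid:     (rows, cols) of the patch grid laid over the image
    selected: boolean array of shape `grid`; True marks a patch flagged by
              the coarse step as likely to contain small objects
    scale:    per-side down-sampling factor for LR patches (assumed value)
    """
    height, width = hr_size
    rows, cols = grid
    patch_h, patch_w = height // rows, width // cols
    hr_patch_pixels = patch_h * patch_w
    lr_patch_pixels = (patch_h // scale) * (patch_w // scale)
    n_selected = int(np.count_nonzero(selected))
    n_background = rows * cols - n_selected
    return n_selected * hr_patch_pixels + n_background * lr_patch_pixels


# Hypothetical selection: 2 of 16 patches are flagged for HR processing.
selected = np.zeros((4, 4), dtype=bool)
selected[1, 2] = True
selected[3, 0] = True

mixed = processed_pixels((1024, 1024), (4, 4), selected, scale=4)
full = 1024 * 1024  # cost of processing the whole image at HR
print(f"mixed: {mixed} pixels, full HR: {full} pixels, ratio: {mixed / full:.3f}")
```

Under these assumed numbers, the mixed-resolution pass touches roughly 18% of the pixels of a full HR pass. The actual saving depends on how many patches the agent selects and on the down-sampling factor.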

Scaling end-to-end learning to control robots with vision inputs is a challenging problem in deep reinforcement learning (DRL). Despite remarkable success in complex sequential tasks, vision-based DRL remains extremely data-inefficient, especially when dealing with high-dimensional pixel inputs. Many recent studies have tried to leverage state representation learning (SRL) to break through this barrier, and some of them can even help the agent learn from pixels as efficiently as from states. Reproducing existing work, accurately judging the improvements offered by novel methods, and applying these approaches to new tasks are vital for sustaining this progress. However, meeting these three demands is seldom straightforward. Without significance criteria and tighter standardization of experimental reporting, it is difficult to determine whether improvements over previous methods are meaningful. For this reason, we conducted ablation studies on hyperparameters, embedding network architecture, embedding dimension, regularization methods, sample quality, and SRL methods to systematically compare and analyze their effects on representation learning and reinforcement learning. We summarize three evaluation metrics and adopt five baseline algorithms (both value-based and policy-based) and eight tasks, so that conclusions do not depend on the particulars of any single experimental setting. Based on a wide range of experimental analyses, we highlight the variability in reported methods and suggest guidelines to make future results in SRL more reproducible and stable. We aim to spur discussion about how to ensure continued progress in the field by minimizing wasted effort stemming from results that are non-reproducible and easily misinterpreted.

Articles published on State Representation Learning

Reinforcement Learning-Based Multimodal Model for the Stock Investment Portfolio Management Task

Robust Visual Imitation Learning with Inverse Dynamics Representations

Robot skill learning and the data dilemma it faces: a systematic review

On learning latent dynamics of the AUG plasma state

Enhancing Representation Learning With Spatial Transformation and Early Convolution for Reinforcement Learning-Based Small Object Detection

Deep variational Luenberger-type observer with dynamic objects channel-attention for stochastic video prediction

Solving Partially Observable 3D-Visual Tasks with Visual Radial Basis Function Network and Proximal Policy Optimization

Invariant Representations Learning with Future Dynamics

Improving Deep Reinforcement Learning With Mirror Loss

Monocular vision guided deep reinforcement learning UAV systems with representation learning perception

Analysis of Conventional Feature Learning Algorithms and Advanced Deep Learning Models

Enhancing Visual Domain Randomization with Real Images for Sim-to-Real Transfer

DiffSRL: Learning Dynamical State Representation for Deformable Object Manipulation With Differentiable Simulation

Exploratory State Representation Learning

Masked Contrastive Representation Learning for Reinforcement Learning

An Experimental Study on State Representation Extraction for Vision-Based Deep Reinforcement Learning

Robotic Manipulation with Reinforcement Learning, State Representation Learning, and Imitation Learning (Student Abstract)

Learning a State Representation and Navigation in Cluttered and Dynamic Environments

Cooperative zone-based rebalancing of idle overhead hoist transportations using multi-agent reinforcement learning with graph representation learning

State Representation Learning With Adjacent State Consistency Loss for Deep Reinforcement Learning
