Maze Problem Research Articles

Object detectors typically use bounding box regressors to improve the accuracy of object localization. Currently, the two types of bounding box regression loss are $\ell _{n}$ -norm-based and intersection over union ( $IoU$ )-based. However, we found that these two types of losses have their drawbacks. First, for $\ell _{n}$ -norm-based loss, large-scale objects are more likely to obtain a larger penalty than the smaller ones when calculating localization errors, which will cause regression loss imbalance. Second, $\ell _{n}$ -norm-based loss has symmetry so that when the predicted bounding boxes are in some unique symmetrical relationships (i.e., Symmetric Trap), the regression loss remains unchanged. Third, for $IoU$ -based loss, the overlap area and the union area do not change as the shape or relative position of two bounding boxes changes in some cases(i.e., Area Maze). To address these problems, we propose the scale balanced loss( $\mathcal {L}_{SB}$ ), which is asymmetric, position-sensitive, and scale-invariant. First, in order to obtain the property of scale invariance, it is designed as a fraction to eliminate the scale information contained in the numerator and denominator. Second, by incorporating the Euclidean distance between different corner points instead of the area, $\mathcal {L}_{SB}$ is sensitive to the changes of coordinates of any corner point, so as to solve the area maze problem. Finally, by incorporating the diagonals of the overlap and the smallest enclosing rectangle, this fraction is no longer symmetric, thus solving the symmetry trap problem. To validate the proposed algorithm, we have replaced the $\ell _{n}$ -norm-based loss of YOLOv3 and SSD with $\mathcal {L}_{GIoU}$ and $\mathcal {L}_{SB}$ and evaluate their performance on Pascal Visual Object Classes and Microsoft Common Objects in Context benchmarks. The final results show that $\mathcal {L}_{SB}$ has improved their average precisions at different $IoU$ thresholds and scales. We envision that this regression loss can also improve the performance of other visual tasks.

Read full abstract

Because of a powerful temporal-difference (TD) with λ [TD( λ )] learning method, this paper presents a novel n -step adaptive dynamic programming (ADP) architecture that combines TD( λ ) with regular TD learning for solving optimal control problems with reduced iterations. In contrast with a backward view learning of TD( λ ) that is required an extra parameter named eligibility traces to update at the end of each episode (offline training), the new design in this paper has forward view learning, which is updated at each time step (online training) without needing the eligibility trace parameter in various applications without mathematical models. Therefore, the new design is called the online model-free n -step action-dependent (AD) heuristic dynamic programming [NSHDP( λ )]. NSHDP( λ ) has three neural networks: the critic network (CN) with regular one-step TD [TD(0)], the CN with n -step TD learning [or TD( λ )], and the actor network (AN). Because the forward view learning does not require any extra eligibility traces associated with each state, the NSHDP( λ ) architecture has low computational costs and is memory efficient. Furthermore, the stability is proven for NSHDP( λ ) under certain conditions by using Lyapunov analysis to obtain the uniformly ultimately bounded (UUB) property. We compare the results with the performance of HDP and traditional action-dependent HDP( λ ) [ADHDP( λ )] with different λ values. Moreover, a complex nonlinear system and 2-D maze problem are two simulation benchmarks in this paper, and the third one is an inverted pendulum simulation benchmark, which is presented in the supplemental material part of this paper. NSHDP( λ ) performance is examined and compared with other ADP methods.

Read full abstract

Maze Problem Research Articles

Articles published on Maze Problem

Inverse Reinforcement Learning with Agents’ Biased Exploration Based on Sub-Optimal Sequential Action Data

Deep Q-learning with hybrid quantum neural network on solving maze problems

Bayesian reinforcement learning for navigation planning in unknown environments.

Rosenblatt’s First Theorem and Frugality of Deep Learning

Simulation and Implementation of a Mobile Robot Trajectory Planning Solution by Using a Genetic Micro-Algorithm

Controller-Based Interactive Proof to Ensure Trust in Multiagent Systems

Multi-Faceted Decision Making Using Multiple Reinforcement Learning to Reducing Wasteful Actions

Quantum reinforcement learning: the maze problem

Path discovering in maze area using mobile robot

Recombination and Novelty in Neuroevolution: A Visual Analysis

Exploration of Applying Lego NXT and Arduino in Situated Engineering Teaching: A Case Study of a Robotics Contest at King Saud University

A learning search algorithm with propagational reinforcement learning

FAOD: Fast Automatic Option Discovery in Hierarchical Reinforcement Learning

A Scale Balanced Loss for Bounding Box Regression

Analyzing trajectories of learning processes through behaviour-based entropy

Online Model-Free n-Step HDP With Stability Analysis.

Parallel Accelerated Virtual Physarum Lab Based on Cellular Automata Agents

Adaptive reinforcement learning with active state-specific exploration for engagement maximization during simulated child-robot interaction

Multi-Agent Cooperation Based on Reinforcement Learning with Internal Reward in Maze Problem

Planning and navigation as active inference

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Maze Problem Research Articles

Articles published on Maze Problem

Inverse Reinforcement Learning with Agents’ Biased Exploration Based on Sub-Optimal Sequential Action Data

Deep Q-learning with hybrid quantum neural network on solving maze problems

Bayesian reinforcement learning for navigation planning in unknown environments.

Rosenblatt’s First Theorem and Frugality of Deep Learning

Simulation and Implementation of a Mobile Robot Trajectory Planning Solution by Using a Genetic Micro-Algorithm

Controller-Based Interactive Proof to Ensure Trust in Multiagent Systems

Multi-Faceted Decision Making Using Multiple Reinforcement Learning to Reducing Wasteful Actions

Quantum reinforcement learning: the maze problem

Path discovering in maze area using mobile robot

Recombination and Novelty in Neuroevolution: A Visual Analysis

Exploration of Applying Lego NXT and Arduino in Situated Engineering Teaching: A Case Study of a Robotics Contest at King Saud University

A learning search algorithm with propagational reinforcement learning

FAOD: Fast Automatic Option Discovery in Hierarchical Reinforcement Learning

A Scale Balanced Loss for Bounding Box Regression

Analyzing trajectories of learning processes through behaviour-based entropy

Online Model-Free n-Step HDP With Stability Analysis.

Parallel Accelerated Virtual Physarum Lab Based on Cellular Automata Agents

Adaptive reinforcement learning with active state-specific exploration for engagement maximization during simulated child-robot interaction

Multi-Agent Cooperation Based on Reinforcement Learning with Internal Reward in Maze Problem

Planning and navigation as active inference