Q-learning Approach Research Articles

In this publication we introduce MAM-STM, a software to autonomously manipulate arbitrary moieties towards specific positions on a metal surface utilizing the tip of a scanning tunneling microscope (STM). Finding the optimal manipulation parameters for a specific moiety is challenging and time consuming, even for human experts. MAM-STM combines autonomous data acquisition with a sophisticated Q-learning implementation to determine the optimal bias voltage, the z-approach distance, and the tip position relative to the moiety. This then allows to arrange single molecules and atoms at will. In this work, we provide a tutorial based on a simulated response to offer a comprehensive explanation on how to use and customize MAM-STM. Additionally, we assess the performance of the machine learning algorithm by benchmarking it within a simulated stochastic environment. PROGRAM SUMMARYProgram title: MAM-STMCPC Library link to program files: https://doi.org/10.17632/gtf3bt4v47.1Developer's repository link: https://gitlab.tugraz.at/software_public/mam_stm.gitLicensing provisions: GNU General Public License 3 (GPL)Programming language: Python 3Nature of problem: Achieving precise control over the arrangement of individual molecules on surfaces is essential for advancing nanofabrication and understanding molecular interaction processes. While self-assembly offers a method for forming nanostructures, achieving arbitrary arrangements of moieties remains difficult. Current approaches, such as scanning probe microscopy (SPM), require extensive manual intervention and precise control is difficult to achieve consistently due to the stochastic nature of quantum mechanical systems at the nanoscale. Thus, learning to manipulate several moieties in order to build even relatively small structures is challenging and time consuming and the automation through conventional expert systems is hindered by the lack of prior knowledge about the surface-moiety interaction processes.Solution method: This scenario is ideal for machine learning algorithms, like reinforcement learning (RL), which do not require an underlying model but are able to autonomously learn the optimal manipulation parameters by performing manipulations directly at the machine. Introducing MAM-STM, which stands for Molecular and Atomic Manipulation via Scanning Tunneling Microscopy. MAM-STM allows to control molecules and atoms by learning the manipulation parameters for either vertical or lateral manipulations. However, the vast number of manipulation parameter combinations and the inefficient learning procedure of RL agents exhibit several challenges. MAM-STM overcomes these challenges with an autonomous masking routine that eliminates manipulation parameters that induce structural changes to the moiety or lift it off the surface. Additionally, a sophisticated Q-learning approach is developed that speeds up the learning procedure, enabling molecular manipulations within one day of training.

Read full abstract

PurposeThe purpose of the paper is to propose and demonstrate a novel approach for addressing the challenges of path planning and obstacle avoidance in the context of mobile robots (MR). The specific objectives and purposes outlined in the paper include: introducing a new methodology that combines Q-learning with dynamic reward to improve the efficiency of path planning and obstacle avoidance. Enhancing the navigation of MR through unfamiliar environments by reducing blind exploration and accelerating the convergence to optimal solutions and demonstrating through simulation results that the proposed method, dynamic reward-enhanced Q-learning (DRQL), outperforms existing approaches in terms of achieving convergence to an optimal action strategy more efficiently, requiring less time and improving path exploration with fewer steps and higher average rewards.Design/methodology/approachThe design adopted in this paper to achieve its purposes involves the following key components: (1) Combination of Q-learning and dynamic reward: the paper’s design integrates Q-learning, a popular reinforcement learning technique, with dynamic reward mechanisms. This combination forms the foundation of the approach. Q-learning is used to learn and update the robot’s action-value function, while dynamic rewards are introduced to guide the robot’s actions effectively. (2) Data accumulation during navigation: when a MR navigates through an unfamiliar environment, it accumulates experience data. This data collection is a crucial part of the design, as it enables the robot to learn from its interactions with the environment. (3) Dynamic reward integration: dynamic reward mechanisms are integrated into the Q-learning process. These mechanisms provide feedback to the robot based on its actions, guiding it to make decisions that lead to better outcomes. Dynamic rewards help reduce blind exploration, which can be time-consuming and inefficient and promote faster convergence to optimal solutions. (4) Simulation-based evaluation: to assess the effectiveness of the proposed approach, the design includes a simulation-based evaluation. This evaluation uses simulated environments and scenarios to test the performance of the DRQL method. (5) Performance metrics: the design incorporates performance metrics to measure the success of the approach. These metrics likely include measures of convergence speed, exploration efficiency, the number of steps taken and the average rewards obtained during the robot’s navigation.FindingsThe findings of the paper can be summarized as follows: (1) Efficient path planning and obstacle avoidance: the paper’s proposed approach, DRQL, leads to more efficient path planning and obstacle avoidance for MR. This is achieved through the combination of Q-learning and dynamic reward mechanisms, which guide the robot’s actions effectively. (2) Faster convergence to optimal solutions: DRQL accelerates the convergence of the MR to optimal action strategies. Dynamic rewards help reduce the need for blind exploration, which typically consumes time and this results in a quicker attainment of optimal solutions. (3) Reduced exploration time: the integration of dynamic reward mechanisms significantly reduces the time required for exploration during navigation. This reduction in exploration time contributes to more efficient and quicker path planning. (4) Improved path exploration: the results from the simulations indicate that the DRQL method leads to improved path exploration in unknown environments. The robot takes fewer steps to reach its destination, which is a crucial indicator of efficiency. (5) Higher average rewards: the paper’s findings reveal that MR using DRQL receive higher average rewards during their navigation. This suggests that the proposed approach results in better decision-making and more successful navigation.Originality/valueThe paper’s originality stems from its unique combination of Q-learning and dynamic rewards, its focus on efficiency and speed in MR navigation and its ability to enhance path exploration and average rewards. These original contributions have the potential to advance the field of mobile robotics by addressing critical challenges in path planning and obstacle avoidance.

Read full abstract

Q-learning Approach Research Articles

Related Topics

Articles published on Q-learning Approach

Output feedback fault-tolerant Q-learning for discrete-time linear systems with actuator faults

A Vision-based Robotic Navigation Method Using an Evolutionary and Fuzzy Q-Learning Approach

A scalable Deep Q-Learning approach for hot stamping process under dynamic control environment

A reinforcement learning approach to solving very-short term train rescheduling problem for a single-track rail corridor

Calibration Method for Relativistic Navigation System Using Parallel Q-Learning Extended Kalman Filter.

Heat Conduction Control Using Deep Q-Learning Approach with Physics-Informed Neural Networks

Optimal Electric Vehicle Battery Management Using Q-learning for Sustainability

A Q-Learning Approach for Optimizing the Impact of Musical Education Using Virtual Reality and Social Robots

MAM-STM: A software for autonomous control of single moieties towards specific surface positions

Minimizing Negative Side Effects in Cooperative Multi-Agent Systems using Distributed Coordination

Social-Aware Peer Selection for Energy Efficient D2D Communications in UAV-Assisted Networks: A Q-Learning Approach

A Subgraph-Based Hierarchical Q-Learning Approach to Optimal Resource Scheduling for Complex Industrial Networks

Multiagent Q-Learning Approach for the Recharging Scheduling of Electric Automated Guided Vehicles in Container Terminals

Model-Free Q-Learning for the Tracking Problem of Linear Discrete-Time Systems.

Predictive mobility and cost-aware flow placement in SDN-based IoT networks: a Q-learning approach

A dynamic reward-enhanced Q-learning approach for efficient path planning and obstacle avoidance in mobile robotics

A deep Q-learning approach to optimize ordering and dynamic pricing decisions in the presence of strategic customers

Influence Maximization in Dynamic Networks Using Reinforcement Learning

Joint Content Caching, Recommendation, and Transmission for Layered Scalable Videos Over Dynamic Cellular Networks: A Dueling Deep Q-Learning Approach

Spectrum Efficient Mode Selection and Resource Allocation Optimization for D2D Communication in HetNet: A Multi-Agent Q-Learning Approach

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Q-learning Approach Research Articles

Related Topics

Articles published on Q-learning Approach

Output feedback fault-tolerant Q-learning for discrete-time linear systems with actuator faults

A Vision-based Robotic Navigation Method Using an Evolutionary and Fuzzy Q-Learning Approach

A scalable Deep Q-Learning approach for hot stamping process under dynamic control environment

A reinforcement learning approach to solving very-short term train rescheduling problem for a single-track rail corridor

Calibration Method for Relativistic Navigation System Using Parallel Q-Learning Extended Kalman Filter.

Heat Conduction Control Using Deep Q-Learning Approach with Physics-Informed Neural Networks

Optimal Electric Vehicle Battery Management Using Q-learning for Sustainability

A Q-Learning Approach for Optimizing the Impact of Musical Education Using Virtual Reality and Social Robots

MAM-STM: A software for autonomous control of single moieties towards specific surface positions

Minimizing Negative Side Effects in Cooperative Multi-Agent Systems using Distributed Coordination

Social-Aware Peer Selection for Energy Efficient D2D Communications in UAV-Assisted Networks: A Q-Learning Approach

A Subgraph-Based Hierarchical Q-Learning Approach to Optimal Resource Scheduling for Complex Industrial Networks

Multiagent Q-Learning Approach for the Recharging Scheduling of Electric Automated Guided Vehicles in Container Terminals

Model-Free Q-Learning for the Tracking Problem of Linear Discrete-Time Systems.

Predictive mobility and cost-aware flow placement in SDN-based IoT networks: a Q-learning approach

A dynamic reward-enhanced Q-learning approach for efficient path planning and obstacle avoidance in mobile robotics

A deep Q-learning approach to optimize ordering and dynamic pricing decisions in the presence of strategic customers

Influence Maximization in Dynamic Networks Using Reinforcement Learning

Joint Content Caching, Recommendation, and Transmission for Layered Scalable Videos Over Dynamic Cellular Networks: A Dueling Deep Q-Learning Approach

Spectrum Efficient Mode Selection and Resource Allocation Optimization for D2D Communication in HetNet: A Multi-Agent Q-Learning Approach