Q-learning Method Research Articles

This article considers a modern machine learning method known as reinforcement learning. In tasks that are solved through interaction, it is often impractical to obtain examples of the desired behavior of an intelligent software agent that are both correct and appropriate for all situations, because of the uncertainty that arises from incomplete information about the environment and from the possible actions of other bots or humans. The software agent should therefore be trained on the basis of its own experience. An important advantage of reinforcement learning is that a bot can be trained "from scratch" by balancing exploration against exploitation and by learning strategies that sacrifice some reward at the current stage for the sake of greater benefit in the future. Research in reinforcement learning can be seen as part of a broader process that has developed over the last few years: the interaction of artificial intelligence with other engineering disciplines. As a result, reinforcement learning develops ideas drawn from optimal control theory, stochastic optimization, and approximation, while pursuing the general and ambitious goals of artificial intelligence. This work presents the mathematical apparatus of reinforcement learning using the model-free Q-learning method, shows practical aspects of its application, and develops an effective strategy for training a bot in an artificial environment (a computer video game). The information used by the agent plays the role of the observed variables, and the hidden variables are long-term estimates of the benefit it gains. The benefit function, which the agent receives at the next time step, is calculated from the current state of the environment and the bot's actions.
Experimental studies of the method were carried out using the developed software. The optimal parameter settings, learning curves, and training time of the bot were obtained. The results may be useful for computer systems of various functional purposes; they can be applied in modeling and design, in automatic control and decision-making systems, in robotics, in stock markets, and so on.
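The model-free Q-learning update and the exploration/exploitation trade-off described in this abstract can be illustrated with a minimal tabular sketch. Everything in it is an assumption made for illustration rather than the article's setup: the five-state chain environment stands in for the video game, and the function names and the values of `alpha`, `gamma`, and `epsilon` are hypothetical.

```python
import random


def train_q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical chain: states 0..n_states-1,
    actions 0 (left) and 1 (right), reward 1 on reaching the last state."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                # exploration: try a random action
                a = rng.randrange(2)
            else:
                # exploitation: best known action, ties broken at random
                best = max(q[s])
                a = rng.choice([i for i in range(2) if q[s][i] == best])
            s_next = max(0, s - 1) if a == 0 else min(goal, s + 1)
            reward = 1.0 if s_next == goal else 0.0
            done = s_next == goal
            # Q-learning target: immediate reward plus the discounted
            # best estimate for the next state (no environment model used)
            target = reward if done else reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q


def greedy_policy(q):
    """Best action per state according to the learned table."""
    return [max(range(2), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    table = train_q_learning()
    print(greedy_policy(table)[:-1])  # actions for the non-terminal states
```

In this sketch, `epsilon` sets how often the agent explores instead of exploiting its current estimates, and `gamma` sets how strongly future benefit is weighed against the immediate reward.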

With the emergence of big data computing and analysis, cloud computing services have become increasingly popular, which has recently drawn considerable attention from researchers to developing new applications and mechanisms. In this paper, we consider on-demand mechanism design in infrastructure as a service (IaaS), including resource allocation and pricing under dynamic scenarios. Most existing work on mechanism design assumes static and independent individual utility, whereas cloud computing services are provided in a dynamic environment. To solve such problems, we start by analyzing the Google cluster-usage dataset to extract the statistical and stochastic characteristics of IaaS consumers and providers. Based on the characteristics mined from real data, we propose a stochastic matching algorithm based on a Markov Decision Process (MDP), which aims to optimize long-term system efficiency, with an online version that uses the Q-learning method to address imperfect model estimation. We further design an efficient (EF), incentive compatible (IC), individually rational (IR) auction mechanism, which extends the traditional Vickrey-Clarke-Groves (VCG) mechanism. The proposed mechanism is studied under two application scenarios: quality-sensitive services, where a unilateral MDP-VCG auction is implemented, and quality-insensitive services, where an MDP-VCG double auction is implemented. To verify the performance of the proposed mechanism, we conduct experiments using the Google dataset and show that the proposed MDP-based VCG auction mechanism achieves the EF, IC, and IR properties simultaneously.
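For readers unfamiliar with the VCG mechanism that this abstract extends, the following is a minimal sketch of its simplest static case: `k` identical items and bidders who each want one item. It is not the paper's MDP-VCG mechanism; the function name, the bidder names, and the bid values are hypothetical.

```python
def vcg_unit_demand(bids, k):
    """Static VCG for k identical items with unit-demand bidders.

    bids: dict mapping a bidder name to its reported value.
    The k highest bidders win, and each pays the highest losing bid,
    which is the externality a winner imposes on the others.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winners = [name for name, _ in ranked[:k]]
    # With k or fewer bidders nobody is displaced, so the price is zero.
    price = ranked[k][1] if len(ranked) > k else 0.0
    return winners, price


if __name__ == "__main__":
    bids = {"a": 10.0, "b": 7.0, "c": 3.0}  # hypothetical reported values
    print(vcg_unit_demand(bids, 1))
    print(vcg_unit_demand(bids, 2))
```

Because a winner's payment depends only on the other bids, misreporting cannot lower the price it pays (the intuition behind IC), and a truthful winner never pays more than its own value (IR).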

Articles published on Q-learning Method

A Cascaded Algorithm Incorporating Knowledge Transfer Q-learning and Interior Point Method for Coordinated Operation of Integrated Energy System

Application of Q-learning based on adaptive greedy considering negative rewards in football match system

DQN Inspired Joint Computing and Caching Resource Allocation Approach for Software Defined Information-Centric Internet of Things Network

A Novel Nested Q-Learning Method to Tackle Time-Constrained Competitive Influence Maximization

Safe Q-Learning Method Based on Constrained Markov Decision Processes

Analysis and Experimental Study of a Model-Free Reinforcement Learning Method

Path planning of a mobile robot in a free-space environment using Q-learning

A CPSS-Based Network Resource Optimization Mechanism for Wireless Heterogeneous Networks

A Proactive Decision Support System for Online Event Streams

A Q-learning approach based on human reasoning for navigation in a dynamic environment

Deep Reinforcement Learning for Mobile Video Offloading in Heterogeneous Cellular Networks

Off-Policy Interleaved Q-Learning: Optimal Control for Affine Nonlinear Discrete-Time Systems.

Data-Driven Auction Mechanism Design in IaaS Cloud Computing

Energy Efficiency-Delay Tradeoff in Energy-Harvesting-Based D2D Communication: An Experimental Learning Approach

Personalized Prediction of Asthma Severity and Asthma Attack for a Personalized Treatment Regimen.

Robust learning in expert networks: a comparative analysis

Optimal Output Regulation for Model-Free Quanser Helicopter With Multistep Q-Learning

Model-free optimal tracking control for discrete-time system with delays using reinforcement Q-learning

Self-Tuning Method for Increased Obstacle Detection Reliability Based on Internet of Things LiDAR Sensor Models.
