Machine Learning for Real-Time Decision Making

Thomas G Dietterich

doi:10.21236/ada388044

Abstract

Many problems of interest to the Air Force involve routine sequential decision making under uncertainty. Examples include air traffic control, control of autonomous surveillance aircraft, logistics planning and scheduling, and equipment diagnosis and repair. These kinds of problems can be formulated within the framework of Markov Decision Problems (MDPs) and Partially-Observable Markov Decision Problems (POMDPs). Reinforcement Learning is the study of adaptive methods for solving large MDPs and POMDPs. The research funded under this grant developed a hierarchical approach to solving MDPs, called the MAXQ method, that is much more effective than previous non-hierarchical methods. Theoretical analysis proves that MAXQ converges to the optimal solution. Experimental studies show that it gives very large speedups during learning. A second line of research developed two methods for approximately solving large POMDPs. This research also explored cost-sensitive learning and diagnosis by formulating them as POMDPs and applying specialized reinforcement learning methods to solve them. A third line of research focused on function approximation methods and algorithms for practical reinforcement learning. New representations (based on regression trees and support vector machines) and new algorithms (based on more appropriate objective functions) led to improvements in the quality of solutions and the practical application of reinforcement learning to resource-constrained scheduling problems.
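
To make the MDP formulation mentioned above concrete, the following is a minimal sketch of a toy equipment-repair problem solved with plain, non-hierarchical value iteration. It is not the MAXQ method and is not taken from the report. The two states, two actions, transition probabilities, rewards, and discount factor are all hypothetical values chosen for illustration.

```python
# Toy equipment-repair MDP solved by standard value iteration.
# Every state, action, probability, and reward here is an illustrative
# assumption, not data from the report.
STATES = ["working", "broken"]
ACTIONS = ["operate", "repair"]

# P[a][s][s2]: probability of moving from state s to state s2 under action a.
P = {
    "operate": {"working": {"working": 0.9, "broken": 0.1},
                "broken": {"working": 0.0, "broken": 1.0}},
    "repair": {"working": {"working": 1.0, "broken": 0.0},
               "broken": {"working": 0.8, "broken": 0.2}},
}

# R[a][s]: expected immediate reward for taking action a in state s.
R = {
    "operate": {"working": 1.0, "broken": 0.0},
    "repair": {"working": -0.5, "broken": -0.5},
}


def q_value(V, s, a, gamma):
    """One-step lookahead: immediate reward plus discounted expected value."""
    return R[a][s] + gamma * sum(P[a][s][s2] * V[s2] for s2 in STATES)


def value_iteration(gamma=0.9, tol=1e-8, max_iter=10_000):
    """Apply the Bellman optimality backup until the values stop changing."""
    V = {s: 0.0 for s in STATES}
    for _ in range(max_iter):
        V_new = {s: max(q_value(V, s, a, gamma) for a in ACTIONS)
                 for s in STATES}
        delta = max(abs(V_new[s] - V[s]) for s in STATES)
        V = V_new
        if delta < tol:
            break
    # Greedy policy with respect to the converged value estimates.
    policy = {s: max(ACTIONS, key=lambda a: q_value(V, s, a, gamma))
              for s in STATES}
    return V, policy


V, policy = value_iteration()
print(policy)
```

Under these assumed numbers the greedy policy is to operate while the equipment works and to repair it once it breaks. Exhaustive sweeps like this are only feasible for very small state spaces, which is the limitation that motivates the hierarchical and function-approximation approaches the abstract describes.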

