A New AI Approach by Acquisition of Characteristics in Human Decision-Making Process

Yuan Zhou,Siamak Khatibi

doi:10.3390/app14135469

Abstract

Planning and decision making are closely interconnected processes that often occur in tandem, influence and informing each other. Planning usually precedes decision making in the chronological sequence, and it can be viewed as a strategy to make decisions. A comprehensive planning or decision strategy can facilitate effective decisions. Thus, understanding and learning human decision-making strategies has drawn intensive attention from the AI community. For example, applying planning algorithms into reinforcement leaning (RL) can simulate the consequence of different actions and select optimal decisions based on learned models, while inverse reinforcement learning (IRL) learns a reward function and policy from expert demonstration and applies them into new scenarios. Most of these methods work based on learning human decision strategies by using modeling of a Markovian decision-making process (MDP). In this paper, we argue that the property of MDP is not fit for human decision-making processes in the real-world and it is insufficient to capture human decision strategies. To tackle this challenge, we propose a new approach to identify the characteristics of human decision-making processes as a decision map, where the decision strategy is defined by the probability distribution of human decisions that are adaptive to the dynamic changes in the environment. The proposed approach was inspired by imitation learning (IL) but with fundamental differences: (a) Instead of aiming to learn an optimal policy based on expert’s demonstrations, we aimed to estimate the distribution of decisions of any group of people. (b) Instead of modeling the environment by an MDP, we used an ambiguity probability model to consider the uncertainty of each decision. (c) The participant trajectory was obtained by categorizing each decision of a participant to a certain cluster based on the commonness in the distribution of decisions. The result shows a feasible way to capture human long-term decision dependency, which provides a complement to the existing machine learning methods for understanding and learning human decision strategies.

Full Text