Reinforcement Learning Based Path Exploration for Sequential Explainable Recommendation

Yicong Li, Lin Li, Yile Li, Guandong Xu, Philip S Yu, Hongxu Chen

doi:10.1109/tkde.2023.3237741

Abstract

Recent advances in path-based explainable recommendation systems have attracted increasing attention thanks to the rich information available from knowledge graphs. Most existing explainable recommendation methods use only static knowledge graphs and ignore dynamic user-item evolution, leading to less convincing and inaccurate explanations. Although some works boost the performance and explainability of recommendations by modeling the user's temporal sequential behavior, most of them either model the user's sequential interactions only within a path or model them independently of the recommendation mechanism. Moreover, some path-based explainable recommendation methods use random selection or traditional machine learning methods to reduce the number of explainable paths, which cannot guarantee that the remaining paths are of high quality for the recommendation. To address this problem, recent path exploration methods use reinforcement learning to improve path diversity and quality. However, unsupervised training makes path exploration inefficient.
Therefore, we propose a novel Temporal Meta-path Guided Explainable Recommendation leveraging Reinforcement Learning (TMER-RL), which utilizes supervised reinforcement learning to explore item-item paths between consecutive items with attention mechanisms to sequentially model dynamic user-item evolutions on a dynamic knowledge graph for explainable recommendation. Extensive evaluations of TMER-RL on two real-world datasets show state-of-the-art performance compared to recent strong baselines.

Full Text