Abstract

Personalisation has become omnipresent in society. In the domain of health and wellbeing, such personalisation can contribute to better interventions and improved health states of users. To be effective in this domain, personalisation needs to be performed quickly and with minimal impact on the users. Reinforcement learning is one of the techniques that can be used to establish such personalisation, but it is not known for learning quickly. Cluster-based reinforcement learning has been proposed to improve the learning speed: users who show similar behaviour are clustered, and one policy is learned for each cluster. An important factor in this approach is the clustering method, which can influence how much benefit clustering brings. In this paper, we propose three distance metrics based on the state of the users (Euclidean distance, Dynamic Time Warping, and high-level features) and apply different clustering techniques using these distance metrics to study their impact on overall performance. We evaluate the different methods in a simulator with users spawned from very distinct user profiles as well as from overlapping user profiles. The results show that clustering configurations using high-level features significantly outperform regular reinforcement learning without clustering (which learns either one policy for all users or one policy per individual).
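
To make the three kinds of distance concrete, the following is a minimal sketch in Python, not the implementation used in the paper. It assumes each user is represented by a one-dimensional numeric state trace. The feature set in `features` (mean, standard deviation, net change) and the nearest-medoid grouping in `assign_to_nearest` are hypothetical choices made only for illustration; the abstract does not specify which high-level features or clustering techniques are used.

```python
import math
from statistics import mean, pstdev


def euclidean(a, b):
    """Point-wise Euclidean distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Classic Dynamic Time Warping distance with |x - y| as local cost."""
    inf = float("inf")
    # prev[j] holds the best cumulative cost of aligning the rows seen so far
    # with the first j elements of b; row 0 is the empty alignment.
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            curr.append(cost + min(prev[j], curr[j - 1], prev[j - 1]))
        prev = curr
    return prev[len(b)]


def features(trace):
    """Hypothetical high-level features: mean, std deviation, net change."""
    return (mean(trace), pstdev(trace), trace[-1] - trace[0])


def feature_distance(a, b):
    """Euclidean distance between the (assumed) feature vectors of two traces."""
    return euclidean(features(a), features(b))


def assign_to_nearest(traces, medoid_indices, dist):
    """Label each trace with the position of its nearest medoid.

    This is only the assignment step of a medoid-based clustering,
    used here as a stand-in for a full clustering technique.
    """
    return [
        min(range(len(medoid_indices)),
            key=lambda k: dist(trace, traces[medoid_indices[k]]))
        for trace in traces
    ]


# Illustrative traces (made up): the second is a time-shifted copy of the first.
traces = [
    [0, 0, 1, 2, 1, 0],
    [0, 1, 2, 1, 0, 0],
    [5, 5, 5, 5, 5, 5],
]
labels = assign_to_nearest(traces, medoid_indices=[0, 2], dist=dtw)
```

In this toy data the first two traces differ only by a shift in time, so Dynamic Time Warping and the feature-based distance treat them as identical, while the point-wise Euclidean distance does not. Each resulting group would then receive its own policy in a cluster-based reinforcement learning setup.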

