Artificial intelligence (AI) can help older adults age in place and can support them and their caregivers as cognition declines with age. However, effective methods for assessing this technology are needed to benchmark its performance, and a common set of metrics and evaluation methods would allow such assessments to be compared with one another. To this end, we propose a common framework for human-AI interaction involving care recipients and their care networks. Drawing on the results of a literature review, we present a framework comprising sample metrics, related measures, qualified evaluation tools, and contextual factors that affect assessment. This paper provides a sample of common metrics in one of the framework’s measurement spaces (human-AI interaction), discusses some of the impacts of contextual factors, and describes how the common metrics and evaluation framework can be used for meta-analysis and to guide future research. Future articles are planned to cover the other measurement spaces in the framework (system performance, task performance, and well-being), including their particular common metrics and evaluation methods. This effort aims to provide guidance for researchers in this domain and to highlight measurement gaps that future research can fill.