In this paper, we detail the technical development of an adaptive conversation design that is sensitive to group dynamics and accounts for the subtle linguistic differences between dyadic interactions (i.e., one human and one agent) and group interactions in human–robot interaction (HRI), using the German language as a case study. We describe the implementation of robust person and group detection with YOLOv5m and the expansion of knowledge databases using large language models (LLMs) to create adaptive multi-party interactions (MPIs), i.e., group–robot interactions (GRIs). We also describe the use of LLMs to generate training data for socially interactive agents, including social robots, and present a self-developed synthesis tool, the knowledge expander, built to accurately map the diverse needs of different users in public spaces. Finally, we outline the integration of an LLM as a fallback for open-ended questions not covered by our knowledge database, ensuring that this fallback can respond effectively to both individuals and groups within the MPI framework.