Abstract: The development of Situational Judgment Tests (SJTs) is a multi-step procedure that often requires substantial resources and time. As a large language model, ChatGPT has already proven helpful for many complex tasks, including the development of psychological questionnaires. However, ChatGPT has not yet been tested for its ability to develop SJT items, a capability that could speed up SJT development, contribute to the dissemination of psychometrically well-designed SJTs in practice, and support research requiring specific SJTs or SJT versions. The current study (N = 419) therefore examined whether ChatGPT (Feb 13, 2023 version, using ChatGPT-3.5) can create SJT items assessing a facet of personality (i.e., gregariousness) that show psychometric properties (examined through confirmatory factor, reliability, and correlational analyses) similar to those of human-created SJT items. Results revealed that the measurement model for all eleven ChatGPT-created SJT items showed a poor fit, whereas a reduced set of eight items yielded an acceptable fit. Reliability estimates as well as convergent and discriminant validity evidence were similar for the human-created and the ChatGPT-created SJT. Implications and potential boundary conditions of SJT development with ChatGPT are discussed.
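For readers unfamiliar with the analytic steps summarized above, the following is a minimal sketch of how a one-factor confirmatory factor analysis and its fit indices might be computed in Python with the semopy package. The package choice, the synthetic placeholder data, the item names (sjt1–sjt11), and the particular items dropped for the reduced set are illustrative assumptions only; they do not reproduce the study's actual data or analysis.

```python
# Illustrative sketch (not the authors' code): fit a one-factor CFA to SJT
# item responses and compare fit indices for an 11-item vs. an 8-item set.
import numpy as np
import pandas as pd
import semopy

rng = np.random.default_rng(0)

# Placeholder data: 419 respondents, 11 items loading on one latent factor.
# In the actual study these would be participants' scored SJT responses.
n, k = 419, 11
latent = rng.normal(size=(n, 1))
items = 0.6 * latent + rng.normal(scale=0.8, size=(n, k))
df = pd.DataFrame(items, columns=[f"sjt{i}" for i in range(1, k + 1)])

def fit_one_factor(data: pd.DataFrame) -> pd.DataFrame:
    """Fit a single-factor CFA and return common fit indices (CFI, RMSEA, ...)."""
    desc = "gregariousness =~ " + " + ".join(data.columns)
    model = semopy.Model(desc)
    model.fit(data)
    return semopy.calc_stats(model)

print(fit_one_factor(df))  # full 11-item model
# Hypothetical reduced model; the items actually removed are not specified here.
print(fit_one_factor(df.drop(columns=["sjt3", "sjt7", "sjt10"])))
```

Comparing the fit statistics of the two models (e.g., CFI and RMSEA against conventional cutoffs) mirrors the kind of full-set versus reduced-set comparison described in the abstract.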