Abstract

Background
Psychological autopsy is essential for establishing theories, exploring trends, and identifying previously unexplored psychosocial risk factors in suicide research. However, qualitative research has been scrutinized for being prone to interpretation bias, problems with accuracy, and challenges to reproducibility, and it is very time- and cost-intensive. The current study aimed to investigate whether a large language model (LLM) can achieve sufficient agreement with a researcher in the deductive coding of interview data obtained in a psychological autopsy study of suicide to be integrated into qualitative research procedures.

Methods
Data from 38 interviews were deductively coded by a researcher and a LLaMA 3-based language model. Model performance was evaluated on four increasingly difficult coding tasks, including binary classification and data summarization. Intercoder agreement was calculated using Cohen's kappa.

Results
Preliminary results showed that the LLM achieved substantial agreement with the human coder on the binary classification task (κ = .78). Variability in performance was influenced by code definitions. Results on the quality of LLM interpretation and summarization will also be presented.

Conclusions
A state-of-the-art LLM can be readily integrated into the qualitative analysis of psychological autopsy interviews and may improve real-time monitoring of suicides. We recommend a human-AI collaborative model, whereby deductive coding by the LLM is complemented by human inductive coding and further interpretation.

Key messages
• Integrating an LLM with qualitative research procedures is feasible in a collaborative model.
• Integrating an LLM with qualitative research procedures allows near real-time monitoring based on qualitative data, which extends to other public health fields.
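Cohen's kappa, the agreement statistic named in the Methods, corrects raw percent agreement between two coders for the agreement expected by chance alone. A minimal sketch of the calculation for two binary coding sequences (the label lists below are illustrative examples, not data from the study):

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    n = len(coder_a)
    # Observed agreement: proportion of items both coders labelled identically.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected chance agreement from each coder's marginal label frequencies.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(counts_a[k] * counts_b[k]
              for k in set(coder_a) | set(coder_b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical binary codes (1 = code present, 0 = absent) from a human
# coder and an LLM on the same eight interview segments.
human = [1, 1, 1, 0, 0, 0, 1, 0]
llm   = [1, 1, 0, 0, 0, 1, 1, 0]
print(cohens_kappa(human, llm))  # → 0.5
```

By the commonly cited Landis and Koch benchmarks, values of .61 to .80 are interpreted as substantial agreement, which is how the study's κ = .78 is characterized.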