This study explores the potential of Large Language Models (LLMs), such as those from the GPT family, to automate inductive qualitative coding: assigning codes to text excerpts and organizing them into categories, a process traditionally performed manually in social science and software engineering research. Our primary question is: Can LLMs effectively automate inductive qualitative coding? To investigate this, we compared different prompt engineering techniques, including Zero-shot, Few-shot, and Chain-of-Thought (CoT) prompting, for coding interview transcripts. While LLMs cannot fully replace human coders, they can aid the process in a human-in-the-loop approach. Few-shot learning showed consistent performance with moderate amounts of data, while CoT proved most effective in reducing partial hallucinations. Although initially aimed at full automation, our study pivoted to testing prompt strategies after we realized that a human-in-the-loop process would offer better accuracy and flexibility, given the challenges of context and token limits in LLMs. These findings suggest that LLMs tailored with adequate prompting techniques can assist researchers performing qualitative analysis.
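To make the compared prompting strategies concrete, the sketch below shows how zero-shot and few-shot prompts for inductive coding of an interview excerpt might be assembled. This is a minimal illustration only: the instructions, example excerpts, and codes are hypothetical and are not drawn from the study's data or prompts.

```python
# Hypothetical sketch of zero-shot vs. few-shot prompts for inductive
# qualitative coding. All excerpts and codes below are illustrative.

ZERO_SHOT_TEMPLATE = (
    "You are assisting with inductive qualitative coding.\n"
    "Assign one or more short descriptive codes to the excerpt below.\n\n"
    "Excerpt: {excerpt}\n"
    "Codes:"
)

# Worked examples that a few-shot prompt prepends so the model imitates
# the desired coding style and granularity.
FEW_SHOT_EXAMPLES = [
    ("I only found out about the outage from a teammate's message.",
     "informal communication; delayed awareness"),
    ("We keep a shared checklist so nothing gets skipped during releases.",
     "process documentation; release practices"),
]

def build_few_shot_prompt(excerpt: str) -> str:
    """Build a few-shot prompt by prepending worked coding examples."""
    demos = "\n\n".join(
        f"Excerpt: {text}\nCodes: {codes}" for text, codes in FEW_SHOT_EXAMPLES
    )
    return (
        "You are assisting with inductive qualitative coding.\n"
        "Assign one or more short descriptive codes to each excerpt.\n\n"
        f"{demos}\n\nExcerpt: {excerpt}\nCodes:"
    )

if __name__ == "__main__":
    excerpt = "Honestly, I just copy whatever the last person did in the ticket."
    print(ZERO_SHOT_TEMPLATE.format(excerpt=excerpt))
    print()
    print(build_few_shot_prompt(excerpt))
```

Either prompt string would then be sent to the chosen LLM; a CoT variant would additionally ask the model to explain its reasoning before listing the codes, which is what helps it avoid partial hallucinations.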