Abstract

This study explores the potential of Large Language Models (LLMs), such as those from the GPT family, to automate inductive qualitative coding, the process of assigning codes to text excerpts and organizing them into categories, which is traditionally done manually in social science and software engineering research. Our primary question is: can LLMs effectively automate inductive qualitative coding? To investigate this, we compared different prompt engineering techniques, including Zero-shot, Few-shot, and Chain-of-Thought (CoT) prompting, in coding interview transcripts. While LLMs cannot fully replace human coders, they can aid the process through a human-in-the-loop approach. Few-shot learning showed consistent performance with moderate amounts of data, while CoT proved most effective in reducing partial hallucinations. Initially aimed at full automation, our study pivoted to testing prompting strategies after recognizing that a human-in-the-loop process would offer better accuracy and flexibility, given the challenges of context and token limits in LLMs. These findings suggest that a tailored LLM, paired with adequate prompting techniques, can assist researchers in performing qualitative analysis.
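
For concreteness, the sketch below shows one way a few-shot prompt for inductive coding of a single interview excerpt might be issued to a GPT-family model. It is a minimal illustration, not the study's actual pipeline: the OpenAI Python client, the model name, and the exemplar excerpts and codes are assumptions introduced here for demonstration.

```python
# Minimal sketch: few-shot prompting for inductive qualitative coding.
# Assumes the OpenAI Python client (openai>=1.0) and an OPENAI_API_KEY
# in the environment; model name, exemplars, and codes are illustrative.
from openai import OpenAI

client = OpenAI()

# Hand-coded exemplars drawn from previously analyzed transcripts
# (these serve as the "few-shot" demonstrations in the prompt).
FEW_SHOT_EXAMPLES = [
    ("We kept missing deadlines because requirements changed weekly.",
     "unstable requirements"),
    ("Nobody on the team really owned the test suite.",
     "diffuse responsibility"),
]

def code_excerpt(excerpt: str) -> str:
    """Ask the model to propose a short inductive code for one excerpt."""
    examples = "\n".join(
        f'Excerpt: "{text}"\nCode: {code}' for text, code in FEW_SHOT_EXAMPLES
    )
    prompt = (
        "You are assisting with inductive qualitative coding of interview "
        "transcripts. Assign a short descriptive code to the excerpt, "
        "grounded only in its content.\n\n"
        f'{examples}\n\nExcerpt: "{excerpt}"\nCode:'
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

# Human-in-the-loop use: a researcher reviews every suggested code, e.g.
# print(code_excerpt("The onboarding docs were outdated, so I asked around."))
```

In keeping with the human-in-the-loop finding, such a helper would only propose candidate codes; the researcher retains responsibility for accepting, revising, or merging them into categories.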
