Towards a small language model powered chain‐of‐reasoning for open‐domain question answering

Jihyeon Roh, Minho Kim, Kyoungman Bae

doi:10.4218/etrij.2023-0355

Abstract

We focus on open‐domain question‐answering tasks that involve a chain‐of‐reasoning, which are primarily implemented using large language models. With an emphasis on cost‐effectiveness, we designed EffiChainQA, an architecture centered on the use of small language models. We employed a retrieval‐based language model to address the limitations of large language models, such as the hallucination issue and the lack of updated knowledge. To enhance reasoning capabilities, we introduced a question decomposer that leverages a generative language model and serves as a key component in the chain‐of‐reasoning process. To generate training data for our question decomposer, we leveraged ChatGPT, which is known for its data augmentation ability. Comprehensive experiments were conducted using the HotpotQA dataset. Our method outperformed several established approaches, including the Chain‐of‐Thoughts approach, which is based on large language models. Moreover, our results are on par with those of state‐of‐the‐art Retrieve‐then‐Read methods that utilize large language models.
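The general idea of chaining a question decomposer with a retrieval-based reader can be illustrated with a minimal sketch. This is not the EffiChainQA implementation: the function names, the `#1`-style placeholder convention for referring to earlier answers, the toy corpus, and the rule-based stand-ins for the decomposer, retriever, and reader are all assumptions made for illustration only.

```python
from typing import Callable, List

# Toy corpus standing in for a real retrieval index (made-up illustrative data).
CORPUS = [
    "Alpha Tower was designed by Jane Doe.",
    "Jane Doe was born in Springfield.",
]


def answer_chain(
    question: str,
    decompose: Callable[[str], List[str]],
    retrieve: Callable[[str], str],
    read: Callable[[str, str], str],
) -> str:
    """Answer a multi-hop question by solving its sub-questions in order."""
    answers: List[str] = []
    for sub_q in decompose(question):
        # Fill placeholders such as "#1" with earlier answers; highest index
        # first, so that "#1" never clobbers part of "#10".
        for i in range(len(answers), 0, -1):
            sub_q = sub_q.replace(f"#{i}", answers[i - 1])
        passage = retrieve(sub_q)
        answers.append(read(sub_q, passage))
    return answers[-1]


def toy_decompose(question: str) -> List[str]:
    # Hypothetical stand-in for a trained generative decomposer (hard-coded).
    return ["Who designed Alpha Tower?", "Where was #1 born?"]


def toy_retrieve(sub_q: str) -> str:
    # Hypothetical stand-in for a retriever: pick the passage with the
    # largest word overlap with the sub-question.
    q_words = set(sub_q.lower().strip("?").split())
    return max(
        CORPUS,
        key=lambda p: len(q_words & set(p.lower().strip(".").split())),
    )


def toy_read(sub_q: str, passage: str) -> str:
    # Hypothetical stand-in for a reader model: return the words that
    # follow "by" or "in" in the retrieved passage.
    words = passage.rstrip(".").split()
    for marker in ("by", "in"):
        if marker in words:
            return " ".join(words[words.index(marker) + 1:])
    return passage


if __name__ == "__main__":
    final = answer_chain(
        "Where was the designer of Alpha Tower born?",
        toy_decompose,
        toy_retrieve,
        toy_read,
    )
    print(final)  # Springfield
```

In a real system each of the three callables would be a learned model or a search index; the sketch only shows how the answer to one sub-question feeds into the next.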

Full Text