Retrieval-augmented generation (RAG) mitigates the knowledge cutoff and other inherent limitations of pre-trained language models by retrieving relevant information at inference time. However, current RAG strategies still face challenges in efficiency and accuracy; a key issue is how to dynamically select an appropriate retrieval method for user queries of varying complexity. This study introduces Layered Query Retrieval (LQR), a novel adaptive retrieval-augmented generation framework. LQR combines query complexity classification, layered retrieval strategies, and relevance analysis, using a custom-built training dataset to train smaller models that help the large language model (LLM) retrieve relevant information efficiently. A central technique in LQR is a semantic rule-based approach for distinguishing levels of multi-hop queries. The process first parses the user's query for keywords, then performs keyword-based document retrieval, and finally applies a natural language inference (NLI) model to judge whether each retrieved document is relevant to the query. We validated the approach on multiple single-hop and multi-hop datasets, showing significant gains in both accuracy and efficiency over existing single-step, multi-step, and adaptive methods; on the HotpotQA dataset in particular, LQR outperforms the Adaptive-RAG method by 9.4% in accuracy and 16.14% in F1 score. The proposed approach carefully balances retrieval efficiency against the accuracy of the LLM's responses.
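The pipeline described above — classify query complexity with semantic rules, extract keywords, retrieve by keyword match, and filter retrieved documents with a relevance check — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the cue words, stopword list, and overlap-based relevance test are placeholder assumptions, and `is_relevant` stands in for the actual NLI model.

```python
import re

def classify_complexity(query: str) -> str:
    """Toy semantic rule: queries with multiple clause-linking cue words
    (assumed cues, not the paper's rule set) are flagged as multi-hop."""
    tokens = re.findall(r"[a-z]+", query.lower())
    cues = sum(tokens.count(c) for c in ("who", "which", "whose"))
    return "multi-hop" if cues >= 2 else "single-hop"

def extract_keywords(text: str) -> set[str]:
    """Parse text into lowercase content-word keywords (stopwords removed)."""
    stop = {"the", "a", "an", "of", "is", "was", "in", "who", "which", "whose", "what"}
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in stop}

def retrieve(keywords: set[str], corpus: list[str], k: int = 2) -> list[str]:
    """Keyword-based retrieval: rank documents by keyword overlap with the query."""
    return sorted(corpus, key=lambda d: -len(keywords & extract_keywords(d)))[:k]

def is_relevant(query: str, doc: str) -> bool:
    """Stand-in for the NLI relevance check. A real system would score
    entailment between the document and the query with an NLI model."""
    return len(extract_keywords(query) & extract_keywords(doc)) >= 2

def layered_query_retrieval(query: str, corpus: list[str]) -> dict:
    """Run the sketched LQR stages and return the complexity label
    together with the retrieved documents that pass the relevance filter."""
    level = classify_complexity(query)
    docs = retrieve(extract_keywords(query), corpus)
    return {"complexity": level, "evidence": [d for d in docs if is_relevant(query, d)]}
```

In the full framework, the complexity label would additionally route the query to a single-step or iterative multi-step retrieval strategy; the sketch shows only the shared classify-retrieve-filter skeleton.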