Fuzzy Local Information C-Means based clustering and Fractional Dwarf Mongoose optimization enabled deep learning for relevant document retrieval

Gunjan Chandwani, Anil Ahlawat

doi:10.1016/j.engappai.2023.106954

Abstract

Document Retrieval (DR) requires a model that ranks and retrieves documents by their relevance to a user's question, which demands strong text understanding capability. The primary goal of document retrieval is to find the relevant documents that satisfy the user's question. However, this is a complex process because it requires interpreting natural language text in terms of its syntax, context and semantics. Conventional techniques for ranking documents rely on standard word and sentence encodings to create fixed-length document embeddings. However, the widely used bag-of-words (BoW) method fails to capture significant context, which is crucial for understanding document-query relevance. To overcome these issues, deep neural networks (DNNs) have been proposed to rank search results with respect to the user's question. Here, a unified solution is provided to perform relevant document retrieval using a Dwarf Mongoose Optimization Fractional-based Deep Convolutional Neural Network (DMOF-Deep CNN). The textual content is processed using BERT tokenization and feature term extraction. Moreover, cluster-based indexing with elastic search is accomplished using Fuzzy Local Information C-Means (FLICM) clustering, and the Dice coefficient is employed to perform query matching. Finally, re-ranking based document retrieval is carried out by the deep CNN, which is trained using the designed DMOF. The designed DMOF-Deep CNN outperformed other existing models, delivering a maximum precision of 0.854, recall of 0.913, and F1-score of 0.882.
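
As a rough illustration of the query-matching step, the Dice coefficient between two term sets A and B is 2|A ∩ B| / (|A| + |B|). The sketch below assumes that a query and a document are compared as sets of whitespace-split terms. The function name, the sample documents, and the tokenization are hypothetical and are not taken from the paper, which applies BERT tokenization and feature term extraction before matching.

```python
def dice_coefficient(query_terms, doc_terms):
    """Dice coefficient between two collections of terms, treated as sets."""
    a, b = set(query_terms), set(doc_terms)
    if not a and not b:
        # Assumption: two empty term sets are treated as having no overlap.
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical toy corpus; whitespace splitting stands in for real tokenization.
query = "fuzzy clustering document retrieval".split()
docs = {
    "d1": "fuzzy clustering for document indexing".split(),
    "d2": "deep learning for image classification".split(),
    "d3": "document retrieval with deep networks".split(),
}

# Score every document against the query and sort by descending similarity.
scores = {name: dice_coefficient(query, terms) for name, terms in docs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In the pipeline described above, a score of this kind would only select candidate documents; the final ordering is produced by the re-ranking deep CNN.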

Full Text