Query Expansion (QE) is an efficient technique for improving the reliability of document retrieval systems. It produces a more adequate query than the user's original one by adding one or more expansion keywords, which improves retrieval performance and limits the return of irrelevant information. Searching for suitable documents in huge datasets is tiresome work, and automatic QE is generally used to refine the query. A typical QE technique extracts closely related expressions, clusters the related documents, and draws expansion terms from the clusters. However, classical clustering poses some issues for QE. Hence, this paper proposes a novel optimized bi-clustering mechanism for patent retrieval by QE. The aim of the model is to retrieve patent information by expanding the request query. First, patent data, in the form of abstracts and text, is collected from standard data sources and passed to a text pre-processing stage. Subsequently, the pre-processed text is converted into vectors by the Multi-cascade Transformer Network (MTN). Finally, retrieval is performed by the proposed Optimal Bi-Clustering (OptBi-C) process, whose parameters are optimally determined by a hybrid of the Reptile Search Algorithm (RSA) and the Lion Algorithm (LA), termed the Iteration-based Reptile Search and Lion Algorithm (IRSLA). The performance of the model is examined with several metrics and compared with traditional techniques. When the number of retrieved documents is 10, the precision of the implemented QE-based patent retrieval system is higher than that of DHOA-OptBi-C, HHO-OptBi-C, RSA-OptBi-C, and LA-OptBi-C by 8.82%, 7.35%, 10.29%, and 7.35%, respectively. 
Moreover, when the number of retrieved documents is 6, the recall of the designed QE-based patent retrieval system is higher than that of KNN, CNN, FUZZY, and Bi-clustering by 21.83%, 24.13%, 19.54%, and 11.49%, respectively. These findings demonstrate that the proposed system improves retrieval performance.
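
As a rough illustration of the expand-then-retrieve idea and of the precision and recall measures reported above, the following minimal sketch expands a query with the terms that most often co-occur with it, and computes precision and recall over the top k results. It is not the paper's method: simple co-occurrence counting stands in for the MTN vectors and the IRSLA-tuned OptBi-C step, and the corpus `PATENTS`, the stopword list, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical stopword list and toy corpus, for illustration only.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}

PATENTS = [
    "Lithium battery electrode with graphite coating",
    "Battery electrode coating method for electric vehicles",
    "Wireless antenna for mobile handsets",
    "Graphite electrode manufacturing process",
]


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]


def expand_query(query, documents, num_terms=2):
    """Append the terms that co-occur most often with the query terms."""
    query_terms = tokenize(query)
    query_set = set(query_terms)
    counts = Counter()
    for doc in documents:
        tokens = set(tokenize(doc))
        if tokens & query_set:  # document shares a term with the query
            counts.update(tokens - query_set)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return query_terms + [term for term, _ in ranked[:num_terms]]


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved items that are relevant."""
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


expanded = expand_query("battery", PATENTS)
```

Here `expanded` holds the original term followed by the two most frequent co-occurring terms. The metric helpers take a ranked list of document identifiers and a set of relevant identifiers, so the cut-offs mentioned above (10 and 6 retrieved documents) correspond to `k=10` and `k=6`.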