Is your search query well-formed? A natural query understanding for patent prior art search

Renukswamy Chikkamath, Deepak Rastogi, Mahesh Maan, Markus Endres

doi:10.1016/j.wpi.2023.102254

Abstract

Recent advances in Deep Learning based prior art search have enabled the development of easy-to-use prior art search engines that accept natural language search queries and provide improved search performance. However, unlike conventional keyword-based techniques, where the results are readily interpreted by the presence of queried keywords, Deep Learning based techniques act like a black box. As a result, it is difficult for users to articulate their information needs in a way that yields optimal results. In this paper, we share insights on query well-formedness from extensive experimentation with PQAI (https://projectpq.ai/), an open source Deep Learning based prior art search engine. We study the effects of various query parameters such as grammar, specificity, and verbosity on the search results and show that ill-formed queries containing grammatical errors, non-essential content, and broad terminology adversely affect the relevance of search results. We also develop a number of Machine Learning models, viz. Grammatical Error Detection Model (GEDM), Query Specificity Model (QSM), and Query Verbosity Model (QVM), to identify and mitigate commonly encountered issues with ill-formed queries. The data, survey forms, and code relating to this work will be released to the community (https://github.com/Renuk9390/Query_formedness_PQAI). Finally, we outline critical areas of query understanding in prior art search where further research is needed.

Full Text