Abstract

One of the essential goals in information retrieval is to bridge the gap between the way users would prefer to specify their information needs and the way queries must be expressed. Rule Based Information Retrieval by Computer (RUBRIC) is one of the approaches proposed to achieve this goal. It uses production rules to capture user-query concepts (or topics). In RUBRIC, a set of related production rules is represented as an AND/OR tree or, alternatively, as a disjunction of Minimal Term Sets (MTSs). The retrieval output is determined by evaluating the weighted Boolean expressions of the AND/OR tree, and processing efficiency can be improved by employing MTSs. However, because a weighted Boolean expression ignores term-term associations unless they are explicitly represented in the tree, the terminological gap between users' queries and their information needs may remain. To address this problem, we adopt the generalized vector space model (GVSM) and the p-norm based extended Boolean model. Experiments are performed for two variations of the RUBRIC model extended with GVSM, as well as for the integrated use of RUBRIC with the p-norm based extended Boolean model. The results are compared with those of the original RUBRIC model in terms of recall and precision.
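
As a rough illustration of how a p-norm based extended Boolean model softens strict AND/OR evaluation, the sketch below assumes the commonly cited weighted p-norm formulation; it is not necessarily the exact variant used in the experiments. The function names `pnorm_or` and `pnorm_and`, the parameter names, and the sample values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def pnorm_or(scores: Sequence[float], weights: Sequence[float], p: float = 2.0) -> float:
    """Weighted p-norm OR: grows toward 1 as any child score approaches 1.

    Assumes child scores in [0, 1], at least one positive weight,
    and a finite p >= 1.
    """
    num = sum((w * s) ** p for s, w in zip(scores, weights))
    den = sum(w ** p for w in weights)
    return (num / den) ** (1.0 / p)


def pnorm_and(scores: Sequence[float], weights: Sequence[float], p: float = 2.0) -> float:
    """Weighted p-norm AND: penalizes the distance of each child score from 1."""
    num = sum((w * (1.0 - s)) ** p for s, w in zip(scores, weights))
    den = sum(w ** p for w in weights)
    return 1.0 - (num / den) ** (1.0 / p)


if __name__ == "__main__":
    # Illustrative values: one child term fully matched, one not matched at all.
    child_scores = [1.0, 0.0]
    child_weights = [1.0, 1.0]
    for p in (1.0, 2.0, 10.0):
        print(p, pnorm_or(child_scores, child_weights, p),
              pnorm_and(child_scores, child_weights, p))
```

With p = 1 both operators reduce to a weighted average of the child scores, so AND and OR are indistinguishable. As p grows, OR moves toward the maximum child score and AND toward the minimum, which approaches strict Boolean behavior.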
