Document Relevance Ranking Research Articles

Abstract: The first major use of natural language processing techniques at the European Patent Office (EPO) is described. The application automates the initial classification of newly filed applications with sufficient accuracy to route them reliably to the examiner(s) working in the appropriate technical areas. Precision levels of the order of 80% are required. To achieve this, several matters are considered: recall levels, the problem of rarely occurring technical fields, the options for 'training material' for the software (using existing fully classified documents), the accuracy of OCR scans of the incoming applications, the use of full texts or just abstracts, and confidence levels for the results. The results are presented in terms of the precision and recall achieved at various organisational levels of the EPO, i.e. at the highest (cluster) level, at directorate level, and at technical examiner level. As another measure of applicability, confusion matrices are also presented. The authors also outline some of the other potential uses of categorisation and linguistic techniques within the work of the EPO, such as routing and partial classification of both patent and non-patent literature, identifying potentially relevant citations, extracting bibliographic data of patents cited in incoming applications, document-relevance ranking systems, and the creation of cross-lingual dictionaries.
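
The evaluation measures named in the abstract (precision, recall, and confusion matrices) can be illustrated with a minimal sketch. The labels, sample data, and function names below are hypothetical and chosen only for illustration; they are not taken from the EPO system or its results.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Count how often each (true, predicted) label pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


def precision_recall(matrix, label):
    """Return (precision, recall) for one label from a confusion matrix."""
    true_positives = matrix[(label, label)]
    # Everything the classifier routed to this label, right or wrong.
    predicted = sum(n for (_, p), n in matrix.items() if p == label)
    # Everything that truly belongs to this label.
    actual = sum(n for (t, _), n in matrix.items() if t == label)
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical technical fields and predictions, for illustration only.
true_labels = ["chemistry", "chemistry", "mechanics",
               "mechanics", "electronics", "electronics"]
predicted_labels = ["chemistry", "mechanics", "mechanics",
                    "mechanics", "electronics", "chemistry"]

matrix = confusion_matrix(true_labels, predicted_labels)
for field in ("chemistry", "mechanics", "electronics"):
    precision, recall = precision_recall(matrix, field)
    print(f"{field}: precision={precision:.2f}, recall={recall:.2f}")
```

Off-diagonal entries of the matrix, such as `matrix[("chemistry", "mechanics")]`, show which fields are confused with one another. That is the kind of information a confusion matrix adds beyond the two summary figures.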

Articles published on Document Relevance Ranking

Text-mining assisted regulatory annotation

Automatic categorisation applications at the European patent office
