PatentNet: multi-label classification of patent documents using deep learning based language understanding

Arousha Haghighian Roudsari,Suan Lee,Jafar Afshar,Wookey Lee

doi:10.1007/s11192-021-04179-4

Open Access

https://doi.org/10.1007/s11192-021-04179-4

Journal: Scientometrics
Publication Date: Dec 18, 2021
Citations: 26
License type: open-access

Affiliation: Inha University, Semyung University

Abstract

Patent classification is an expensive and time-consuming task that has conventionally been performed by domain experts. However, the increase in the number of filed patents and the complexity of the documents make the classification task challenging. The text used in patent documents is not always written in a way that conveys knowledge efficiently. Moreover, patent classification is a multi-label classification task with a large number of labels, which makes the problem even more complicated. Hence, automating this expensive and laborious task is essential for assisting domain experts in managing patent documents and for facilitating reliable search, retrieval, and further patent analysis tasks. Transfer learning and pre-trained language models have recently achieved state-of-the-art results in many Natural Language Processing tasks. In this work, we investigate the effect of fine-tuning pre-trained language models, namely BERT, XLNet, RoBERTa, and ELECTRA, for the essential task of multi-label patent classification. We compare these models with the baseline deep-learning approaches used for patent classification, and we use various word embeddings to enhance the performance of the baseline models. The publicly available USPTO-2M patent classification benchmark and the M-patent dataset are used for conducting experiments. We conclude that fine-tuning the pre-trained language models on patent text improves multi-label patent classification performance. Our findings indicate that XLNet performs best and achieves a new state-of-the-art classification performance with respect to precision, recall, and F1 measure, as well as coverage error and label ranking average precision (LRAP).
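
The two ranking-based measures named in the abstract, coverage error and LRAP, are less familiar than precision, recall, and F1, so a minimal sketch of their standard definitions may help. This is not the authors' evaluation code: the function names and the toy labels and scores are assumptions made purely for illustration, and ties are handled by counting every label whose score is at least as high as the reference score.

```python
from typing import Sequence


def coverage_error(y_true: Sequence[Sequence[int]],
                   y_score: Sequence[Sequence[float]]) -> float:
    """Average number of top-ranked labels needed to cover all true labels."""
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [s for t, s in zip(truth, scores) if t == 1]
        if not relevant:
            continue  # a sample with no true labels contributes 0
        lowest = min(relevant)
        # Count how far down the ranking the lowest-scored true label sits.
        total += sum(1 for s in scores if s >= lowest)
    return total / len(y_true)


def lrap(y_true: Sequence[Sequence[int]],
         y_score: Sequence[Sequence[float]]) -> float:
    """Label ranking average precision, averaged over samples."""
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [i for i, t in enumerate(truth) if t == 1]
        if not relevant or len(relevant) == len(truth):
            total += 1.0  # degenerate sample: any ranking is perfect
            continue
        sample = 0.0
        for i in relevant:
            # Rank of this true label among all labels.
            rank = sum(1 for s in scores if s >= scores[i])
            # True labels ranked at or above this one.
            hits = sum(1 for j in relevant if scores[j] >= scores[i])
            sample += hits / rank
        total += sample / len(relevant)
    return total / len(y_true)


# Hypothetical toy data: 2 documents, 3 candidate labels each.
toy_true = [[1, 0, 0], [0, 0, 1]]
toy_score = [[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]]
print(coverage_error(toy_true, toy_score))
print(round(lrap(toy_true, toy_score), 3))
```

Lower coverage error is better (the true labels sit nearer the top of the ranking), whereas higher LRAP is better, with 1.0 meaning every true label outranks every false one.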

Highlights

Patent documents contain valuable information and, if scrutinized, can reveal substantial technical details, inspire new industrial solutions, depict leading business trends, and assist in making critical investment decisions (Zhang et al., 2015).
We demonstrated the superiority of the pre-trained language models by comparing them with other deep learning-based models proposed for the patent classification problem.
Patents are classified according to complex hierarchical standard taxonomies that are prone to change over time.

Summary

Introduction

Patent documents contain valuable information and, if scrutinized, can reveal substantial technical details, inspire new industrial solutions, depict leading business trends, and assist in making critical investment decisions (Zhang et al., 2015). The rapid growth and development in different technology areas have led to a significant increase in patent applications in recent years. This increase in the number of patent documents makes patent analysis and management more complicated and time-consuming for patent experts and examiners, posing significant challenges for many patent information users (Yun & Geum, 2020; Chen et al., 2020b). Common standard classification structures such as the International Patent Classification (IPC) or Cooperative Patent Classification (CPC) are used for patent classification (Shalaby & Zadrozny, 2019). These standard taxonomies consist of complex hierarchical structures that cover all technology areas and help maintain interoperability among patent offices worldwide (Gomez & Moens, 2014). Accurate automated classification of patent documents is critical: it will help experts manage patent documents, facilitate reliable patent search and retrieval, reduce the risk of missing a relevant patent when guarding against patent infringement, and support further patent analysis tasks (Yun & Geum, 2020; Souza et al., 2020; Gomez & Moens, 2014).

Objectives

Methods

Results

Conclusion