Abstract

Many text mining tasks, such as text retrieval, text summarization, and text comparison, depend on extracting representative keywords from the main text. Most existing keyword extraction algorithms rely on discrete, bag-of-words representations of the text. In this paper, we propose a patent keyword extraction algorithm (PKEA) based on the distributed Skip-gram model for patent classification. We also develop a set of quantitative performance measures for evaluating keyword extraction, based on information gain and on cross-validated Support Vector Machine (SVM) classification, which are valuable when human-annotated keywords are not available. We evaluated PKEA on a standard benchmark dataset and on a patent dataset that we compiled ourselves. Our patent dataset includes 2500 patents from five distinct technological fields related to autonomous cars (GPS systems, lidar systems, object recognition systems, radar systems, and vehicle control systems). We compared our method with Frequency, Term Frequency-Inverse Document Frequency (TF-IDF), TextRank, and Rapid Automatic Keyword Extraction (RAKE). The experimental results show that the proposed algorithm provides a promising way to extract keywords from patent texts for patent classification.
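To make the information-gain idea concrete, the following is a minimal sketch of the textbook definition: the reduction in class entropy obtained by knowing whether a keyword occurs in a document. It is not the paper's exact measure, which is not specified in this summary. The function names (`entropy`, `information_gain`) and the toy corpus are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(docs, labels, keyword):
    """Reduction in class entropy from knowing whether `keyword` occurs in a document."""
    with_kw = [y for d, y in zip(docs, labels) if keyword in d]
    without_kw = [y for d, y in zip(docs, labels) if keyword not in d]
    n = len(labels)
    conditional = (len(with_kw) / n) * entropy(with_kw) + (
        len(without_kw) / n
    ) * entropy(without_kw)
    return entropy(labels) - conditional


# Toy, made-up corpus (an assumption for illustration): each document is a set of tokens.
docs = [
    {"laser", "pulse", "range"},
    {"laser", "scanner", "point"},
    {"antenna", "doppler", "range"},
    {"antenna", "signal", "echo"},
]
labels = ["lidar", "lidar", "radar", "radar"]

ig_laser = information_gain(docs, labels, "laser")  # separates the two classes perfectly
ig_range = information_gain(docs, labels, "range")  # occurs equally in both classes
```

In this toy corpus, "laser" appears only in the lidar documents and therefore has the maximum gain of one bit, while "range" appears once in each class and has zero gain. A keyword set whose members score high under such a measure is more useful for separating classes, which is why a measure of this kind can stand in for human-annotated keywords.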

Highlights

  • Patents are an important part of intellectual property

  • The reliability and performance of subsequent analyses will be affected, which in turn makes it hard to draw reliable insights from the analysis results. To address these issues, this paper examines the effectiveness of deep learning-based keyword extraction methods and proposes a method based on the Skip-gram model [20,21,22] that extracts keywords from patent text for patent classification.

  • We develop a method to extract representative keywords from patents; these keywords serve as the features of the patent text for high-performance classification by Support Vector Machine (SVM) classifiers.
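The idea of using extracted keywords as classification features can be sketched as follows. This example uses plain TF-IDF, one of the baselines named in the abstract, rather than PKEA itself, and encodes each document as a binary vector over the extracted keywords. The function names (`tfidf_keywords`, `to_feature_vectors`), the unsmoothed IDF formula, and the two toy documents are assumptions made for illustration only.

```python
import math
from collections import Counter


def tfidf_keywords(docs, k):
    """Return the k highest-scoring TF-IDF terms for each tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    keywords = []
    for doc in docs:
        counts = Counter(doc)
        # Unsmoothed TF-IDF: terms that occur in every document score zero.
        scores = {
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords.append(ranked[:k])
    return keywords


def to_feature_vectors(keywords_per_doc):
    """Encode each document as a binary vector over the union of extracted keywords."""
    vocab = sorted({kw for kws in keywords_per_doc for kw in kws})
    vectors = [[1 if term in kws else 0 for term in vocab] for kws in keywords_per_doc]
    return vocab, vectors


# Toy, made-up documents (an assumption for illustration).
docs = [
    "lidar sensor emits laser pulse and the sensor measures laser return".split(),
    "radar sensor emits radio wave and the antenna receives radio echo".split(),
]
keywords = tfidf_keywords(docs, k=2)
vocab, vectors = to_feature_vectors(keywords)
```

Vectors of this form could then be passed to an SVM classifier and scored with cross-validation. Keeping the feature set limited to a handful of keywords per document is what would let an examiner inspect the evidence behind a predicted class quickly.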


Summary

Introduction

Patents are an important part of intellectual property, and effective patent analysis can bring substantial benefits to an enterprise. Automated patent classifiers are usually applied to large numbers of patent applications, and patent examiners then inspect the evidence behind each predicted class before making the final classification decision. This inspection matters most for predictions in which the classifier has low confidence. Because of this requirement, high-performance patent classifiers that can explain their classifications with extracted keywords, ready for quick inspection by the patent examiner, are strongly desirable.

