On the Potential of Taxonomic Graphs to Improve Applicability and Performance for the Classification of Biomedical Patents

Kai Frerich, Sandra Geisler, Robert Farkas, Mark Bukowski

doi:10.3390/app11020690

Open Access

Abstract

A core task in technology management in biomedical engineering and beyond is the classification of patents into domain-specific categories. This task is increasingly automated by machine learning, but the fuzzy language of patents causes particular problems. In pursuit of higher classification performance, increasingly complex models have been developed that draw not only on text but also on a wealth of distinct (meta)data and methods. However, this makes it difficult to access and integrate data and to fuse distinct predictions. Although the established Cooperative Patent Classification (CPC) offers a plethora of information, it is rarely used in automated patent categorization. Thus, we combine taxonomic and textual information into an ensemble classification system and compare stacking and fixed combination rules as fusion methods. Various classifiers are trained on title/abstract and on both the CPC and IPC (International Patent Classification) assignments of 1230 patents covering six categories of future biomedical innovation. The taxonomies are modeled as tree graphs, parsed, and transformed by Dissimilarity Space Embedding (DSE) into real-valued vectors. The classifier ensemble exceeds the basic performance by nearly 10 points, reaching F1 = 78.7% when stacked with a feed-forward Artificial Neural Network (ANN). Taxonomic base classifiers perform nearly as well as the text-based learners. Moreover, an ensemble of only CPC and IPC learners reaches F1 = 71.2% as a fully language-independent and straightforward approach that relies on established algorithms and readily available integrated data, enabling new possibilities for technology management.
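The Dissimilarity Space Embedding step named in the abstract can be illustrated with a minimal sketch: each object becomes a vector of its distances to a small set of prototype objects. Everything concrete here is an assumption made for illustration and not taken from the paper: the function name `dse_embed`, the toy trees represented as sets of node labels, the set-difference distance, and the choice of the first two objects as prototypes.

```python
def dse_embed(objects, prototypes, distance):
    """Embed each object as the vector of its distances to the prototypes.

    The result has one row per object and one column per prototype, so any
    vector-based classifier can be trained on it afterwards.
    """
    return [[float(distance(obj, proto)) for proto in prototypes]
            for obj in objects]


def node_set_distance(a, b):
    """Toy dissimilarity (assumption): number of nodes present in only one tree."""
    return len(a ^ b)


# Hypothetical taxonomy trees, each reduced to its set of node labels.
trees = [
    {"A", "A61", "A61B"},
    {"A", "A61", "A61N"},
    {"G", "G16", "G16H"},
    {"A", "A61", "A61B", "G", "G16", "G16H"},
]

# Arbitrary prototype choice for the sketch: simply the first two trees.
prototypes = trees[:2]
vectors = dse_embed(trees, prototypes, node_set_distance)

for tree, vec in zip(trees, vectors):
    print(sorted(tree), "->", vec)
```

The dimensionality of the embedding equals the number of prototypes, which is why the selection of prototypes matters in practice; this sketch does not attempt to reproduce any particular selection method.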

Highlights

The analysis of patents is one of the core duties of technology and innovation management with varying purposes and perspectives, such as forecasting emerging technologies, assessing performances of regional/national innovation systems, mapping technologies, managing R&D activities, or evaluating the collaboration potential at company or policy level [1,2]. An essential subtask within these processes is classifying patents into coherent groups of similar documents as a basis for further retrieval and assessment.
After hyperparameter tuning, four different base classifiers compute their predictions from distinct features of the same object (Stage I) to be merged by various fusion methods providing the final prediction.
By inserting a root node, the trees of all assigned class codes are unified in one tree-structured graph, which facilitates computing the distance between the structured codes of the Cooperative Patent Classification (CPC)/International Patent Classification (IPC) taxonomy using the tree edit distance [33].
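The idea of unifying a patent's class codes under one inserted root can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the helper names (`code_path`, `patent_tree`, `tree_distance`) are hypothetical, the hierarchy is flattened to five levels (the real schemes nest subgroups further via dot levels), the example symbols are illustrative and not drawn from the paper's data, and the distance counts only node insertions and deletions, which is a restricted stand-in for the general tree edit distance cited in [33].

```python
import re


def code_path(code):
    """Split a symbol such as 'A61B 5/0205' into its chain of ancestors.

    Assumption: five levels (section, class, subclass, main group, subgroup).
    """
    m = re.fullmatch(r"([A-HY])(\d{2})([A-Z])\s*(\d+)/(\d+)", code.strip())
    if m is None:
        raise ValueError(f"unrecognized classification symbol: {code!r}")
    section, cls, sub, group, subgroup = m.groups()
    subclass = section + cls + sub
    main_group = f"{subclass} {group}"
    return [section, section + cls, subclass, main_group,
            f"{main_group}/{subgroup}"]


def patent_tree(codes, root="ROOT"):
    """Unify all codes of one patent under an inserted root.

    The tree is returned as a set of (parent, child) edges.
    """
    edges = set()
    for code in codes:
        parent = root
        for node in code_path(code):
            edges.add((parent, node))
            parent = node
    return edges


def tree_distance(a, b):
    """Count node insertions plus deletions needed to turn tree a into tree b.

    Every non-root node has exactly one incoming edge, so counting the edges
    that occur in only one tree counts the nodes that differ.
    """
    return len(a ^ b)


# Illustrative symbols only.
p1 = patent_tree(["A61B 5/0205", "A61B 5/024"])
p2 = patent_tree(["A61B 5/0205", "G16H 50/20"])
print(tree_distance(p1, p2))
```

Because a node's label already encodes its full ancestry in this simplified model, relabeling operations never arise, which is what makes the set difference sufficient here; a general tree edit distance would also allow relabeling.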

Summary

Introduction

The analysis of patents is one of the core duties of technology and innovation management with varying purposes and perspectives, such as forecasting emerging technologies, assessing performances of regional/national innovation systems, mapping technologies, managing R&D activities, or evaluating the collaboration potential at company or policy level [1,2]. A plethora of fusion methods, e.g., rule-based approaches such as summing and averaging, or stacking with machine learning algorithms acting as fusion classifiers, are available to customize ensemble classification systems to specific tasks. Despite these advantages, patent categorization into user-defined groups via ensemble classifier systems has rarely been studied so far. The IPC and especially the much more detailed CPC are transformed by means of DSE, an established method that, to the best of our knowledge, is applied to patent taxonomies for the first time. Both textual and taxonomic features serve as input to four different machine learning base classifiers to assign patents into six classes of future biomedical innovation. The conclusions point out the major findings and important future perspectives.
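The rule-based fusion mentioned above (summing, averaging, and similar fixed rules) can be illustrated with a short sketch. The function name `fuse`, the set of rules offered, and the probability values are assumptions made for illustration; three classes are used for brevity although the study works with six, and the labels on the base classifiers are hypothetical.

```python
from math import prod


def fuse(probabilities, rule="mean"):
    """Combine per-classifier class-probability vectors with a fixed rule.

    probabilities holds one list per base classifier, each with one
    probability per class in the same class order.
    Returns (index of the winning class, fused scores).
    """
    n_classes = len(probabilities[0])
    # Regroup the values so that each entry collects all votes for one class.
    columns = [[p[c] for p in probabilities] for c in range(n_classes)]
    if rule == "sum":
        scores = [sum(col) for col in columns]
    elif rule == "mean":
        scores = [sum(col) / len(col) for col in columns]
    elif rule == "product":
        scores = [prod(col) for col in columns]
    elif rule == "max":
        scores = [max(col) for col in columns]
    else:
        raise ValueError(f"unknown rule: {rule}")
    winner = max(range(n_classes), key=scores.__getitem__)
    return winner, scores


# Hypothetical outputs of four base classifiers for a single patent.
base_outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
    [0.3, 0.5, 0.2],
]

for rule in ("sum", "mean", "product", "max"):
    winner, scores = fuse(base_outputs, rule)
    print(rule, winner, scores)
```

With these made-up numbers the sum, mean, and product rules agree on one class while the max rule picks another, which shows why the choice of combination rule is worth comparing. Stacking differs in that a trained classifier takes the place of the fixed rule.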

Official Patent Classification Systems

Automated Text Categorization using Patents

Ensemble Classification

Feature Extraction from Graphs

Materials and Methods

General

Patent

Textual Data

Tree Creation

Vector Space Embedding

Prototype Selection Methods

Classifier Selection and Hyperparameter Tuning

Fusion Methods

Experimental Design

Basic Evaluation

Ensemble Evaluation

Boosting

Outlook

Performance

Limitations

Findings

Conclusions

Journal: Applied Sciences	Publication Date: Jan 12, 2021
Citations: 1	License type: CC BY 4.0
