Deep learning based pipeline with multichannel inputs for patent classification

Mustafa Sofean

doi:10.1016/j.wpi.2021.102060

Abstract

This work introduces a deep learning pipeline for automatic patent classification with multichannel inputs, based on LSTM networks and word vector embeddings. Sophisticated text mining methods are used to extract the most important segments from patent texts, and a domain-specific pre-trained word embeddings model for the patent domain is developed; it was trained on a very large dataset of more than five million patents. The pipeline uses multiple parallel LSTM networks that read the source patent document through different input channels, namely embeddings of different segments of the patent text and a sparse linear input of different metadata. Classifying patents into their corresponding technical fields is selected as a use case. In this use case, a series of patent classification experiments is conducted on different patent datasets. The experimental results indicate that using the segments of patent texts together with the metadata as multichannel inputs for a deep neural network model achieves better performance than using a single input channel.

Highlights

For the deep components of the model, deep layers are created for the most important patent text segments. The text of each segment is encoded into vectors with a pre-trained word embeddings model, and these embeddings are fed as sequential input into Long Short-Term Memory (LSTM) layers
The results of this work indicate that using the segments of patent text as multichannel inputs improves patent classification performance on all evaluation criteria
We introduce a deep learning based pipeline for large-scale patent classification
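
The multichannel idea in the highlights can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the class names `LSTMChannel` and `MultichannelClassifier`, the layer sizes, the random weights standing in for the pre-trained patent embeddings, and the choice to concatenate the metadata vector directly before the output layer are all assumptions made for illustration. It shows one LSTM per text segment, whose final hidden states are merged with a sparse metadata vector and passed to a softmax classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class LSTMChannel:
    """One LSTM reading one text segment (forward pass only, untrained)."""

    def __init__(self, vocab_size, embed_dim, hidden_dim):
        # Random table standing in for a pre-trained embeddings model (assumption).
        self.embeddings = rng.normal(0.0, 0.1, (vocab_size, embed_dim))
        # Stacked weights for the input, forget, candidate and output gates.
        self.W = rng.normal(0.0, 0.1, (4 * hidden_dim, embed_dim + hidden_dim))
        self.b = np.zeros(4 * hidden_dim)
        self.hidden_dim = hidden_dim

    def encode(self, token_ids):
        h = np.zeros(self.hidden_dim)
        c = np.zeros(self.hidden_dim)
        for token_id in token_ids:
            x = self.embeddings[token_id]
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        return h  # final hidden state summarises the segment

class MultichannelClassifier:
    """Parallel LSTM channels plus a sparse metadata input (hypothetical layout)."""

    def __init__(self, segment_names, vocab_size, n_metadata, n_classes,
                 embed_dim=8, hidden_dim=6):
        self.channels = {name: LSTMChannel(vocab_size, embed_dim, hidden_dim)
                         for name in segment_names}
        merged_dim = hidden_dim * len(segment_names) + n_metadata
        self.W_out = rng.normal(0.0, 0.1, (n_classes, merged_dim))
        self.b_out = np.zeros(n_classes)

    def predict_proba(self, segments, metadata):
        encoded = [self.channels[name].encode(segments[name])
                   for name in self.channels]
        # The metadata vector enters linearly, next to the LSTM outputs.
        merged = np.concatenate(encoded + [metadata])
        return softmax(self.W_out @ merged + self.b_out)

# Toy input: token ids per segment and a multi-hot metadata vector.
model = MultichannelClassifier(
    ["technical_field", "summary", "independent_claims"],
    vocab_size=50, n_metadata=4, n_classes=3)
segments = {"technical_field": [3, 17, 8],
            "summary": [5, 5, 21, 40],
            "independent_claims": [9, 2]}
metadata = np.array([1.0, 0.0, 0.0, 1.0])
probs = model.predict_proba(segments, metadata)
```

Here `probs` holds one probability per class. In practice the weights would be trained, and a deep learning framework would replace the hand-written LSTM step.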

Summary

Methods

Patent classification is a kind of knowledge management in which documents are assigned to predefined categories. Because patent language is extremely complicated and the patent classification scheme is hierarchical, many previous studies focused only on the whole patent text or on general sections such as the title, abstract, detailed description, and claims [2] [1]. They did not consider the most important sections, such as the background, technical field, summary, and independent claims, which need specific text mining tools to extract.
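
To make the notion of segment extraction concrete, the following is a minimal rule-based sketch. It is not the text mining method used in the paper: it assumes that the description contains heading lines such as "TECHNICAL FIELD", "BACKGROUND", or "SUMMARY", that other headings are written in upper case, and that a dependent claim can be recognised by a reference to another claim. The names `HEADING_PATTERNS`, `split_description`, and `independent_claims` are hypothetical.

```python
import re

# Heading variants for the segments named in the text (illustrative, not exhaustive).
HEADING_PATTERNS = {
    "technical_field": r"(?:technical\s+field|field\s+of\s+the\s+invention)",
    "background": r"background(?:\s+of\s+the\s+invention|\s+art)?",
    "summary": r"(?:brief\s+)?summary(?:\s+of\s+the\s+invention)?",
}

def split_description(description):
    """Map each recognised heading to the text that follows it."""
    segments = {}
    current = None
    for line in description.splitlines():
        stripped = line.strip()
        matched = next(
            (name for name, pattern in HEADING_PATTERNS.items()
             if re.fullmatch(pattern, stripped, flags=re.IGNORECASE)),
            None)
        if matched:
            current = matched
            segments.setdefault(current, [])
        elif stripped.isupper():
            current = None  # an unrecognised heading ends the current segment
        elif current and stripped:
            segments[current].append(stripped)
    return {name: " ".join(lines) for name, lines in segments.items()}

def independent_claims(claims):
    """Keep claims that do not refer back to another claim."""
    refers_back = re.compile(r"\bclaims?\s+\d+", re.IGNORECASE)
    return [claim for claim in claims if not refers_back.search(claim)]

description = """TECHNICAL FIELD
The invention relates to battery cooling.
BACKGROUND
Known battery packs overheat under load.
SUMMARY
A cooling plate with microchannels is provided.
DETAILED DESCRIPTION
Figure 1 shows the plate."""
claims = ["1. A cooling plate comprising microchannels.",
          "2. The cooling plate of claim 1, wherein the channels are copper."]

segments = split_description(description)
independent = independent_claims(claims)
```

Real patent documents vary widely in their headings and claim wording, so a heuristic like this would miss many cases; it only illustrates why dedicated extraction tooling is needed for these segments.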

Semantic Structure of patent and Embeddings

Deep Learning based Pipeline Architecture

Conclusion

Experimental Results

Journal: World Patent Information
Publication Date: Jul 8, 2021
Citations: 5
License type: cc-by
