Abstract

This paper presents a performance analysis and comparison of several pre-processing methods used in automatic patent classification with graph kernels for Support Vector Machines (SVMs). The pre-processing methods are based on data transformation techniques, namely data scaling, data centering, data standardization, data normalization, the Box-Cox transform, and the Yeo-Johnson transform. The automatic patent classification system is designed to assign an input patent citation graph to one of 10 possible classes of the International Patent Classification (IPC). The input is taken with various background conditions. The experiments showed that the best results are achieved with data normalization as the pre-processing method, with a classification accuracy of up to 85.33.15% for the KEHL and 93.80% for the KVHL. In contrast, for the KEHG, applying pre-processing decreased the accuracy.
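
For readers unfamiliar with the six transforms named above, the following is a minimal NumPy sketch of their textbook forms. It is not the paper's implementation: the feature matrix `X` (rows as samples, columns as features), the function names, the choice of min-max scaling for "data scaling", the choice of per-sample L2 normalization for "data normalization", and the fixed parameter `lam` are all assumptions made for illustration. In practice the power-transform parameter is usually estimated from the data rather than fixed.

```python
import numpy as np


def scale(X):
    """Min-max scale each column to [0, 1] (assumed meaning of 'data scaling')."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero on constant columns
    return (X - lo) / span


def center(X):
    """Subtract the column mean so each feature has zero mean."""
    return X - X.mean(axis=0)


def standardize(X):
    """Zero mean and unit variance per column."""
    std = X.std(axis=0)
    std = np.where(std > 0, std, 1.0)
    return (X - X.mean(axis=0)) / std


def normalize(X):
    """Rescale each row to unit L2 norm (assumed meaning of 'data normalization')."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms = np.where(norms > 0, norms, 1.0)
    return X / norms


def box_cox(X, lam):
    """Box-Cox transform with a fixed lambda; defined only for strictly positive data."""
    X = np.asarray(X, dtype=float)
    if np.any(X <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return np.log(X)
    return (X ** lam - 1.0) / lam


def yeo_johnson(X, lam):
    """Yeo-Johnson transform with a fixed lambda; accepts zero and negative values."""
    X = np.asarray(X, dtype=float)
    out = np.empty_like(X)
    pos = X >= 0
    if lam != 0:
        out[pos] = ((X[pos] + 1.0) ** lam - 1.0) / lam
    else:
        out[pos] = np.log1p(X[pos])
    if lam != 2:
        out[~pos] = -(((-X[~pos] + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam))
    else:
        out[~pos] = -np.log1p(-X[~pos])
    return out
```

The main practical difference between the last two is the input domain: Box-Cox rejects non-positive values, while Yeo-Johnson handles any real input.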
