Abstract

This research aims to retrieve relevant patent documents and to identify the classification codes and search keywords that best characterize a given technological domain in the patent literature. The World Intellectual Property Organization (WIPO) has recorded a rising number of patent applications filed under the Patent Cooperation Treaty (PCT), which is becoming the norm for filing patents in multiple jurisdictions. PCT documents are therefore a valuable source of information on innovation activities with some degree of entrepreneurial intention. However, searching for relevant patent documents can be a daunting and uncertain process. We constructed a high-dimensional matrix, the code-keyword matrix, from two data types: classification codes and search keywords. Two machine learning algorithms, principal component analysis (PCA) and k-means clustering, were then used to derive insights from this high-dimensional dataset. Our combined method yielded a two-dimensional PCA biplot and a clustering of an optimized PCA dataset called Eigen-PCA. With these algorithms, we identified correlations between the two data types. We also clustered the classification codes into least-relevance, medium-relevance, and high-relevance groups for the domain of anti-corrosion technologies, an impactful area for steel infrastructure in maritime environments. Such patent data analytics can be adapted to other areas, such as medical technologies, the green energy transition towards Net Zero, and the conservation of biological diversity.
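
As a rough illustration of the pipeline described above, the following minimal sketch builds a small code-keyword matrix, projects it onto two principal components, and groups the classification codes into three clusters. Everything specific here is an assumption made for illustration: the matrix values are synthetic, the names `code_A` to `code_F` and `kw_1` to `kw_4` are placeholders, the k-means initialization is a simple farthest-point rule, and ranking clusters by mean keyword count is only a stand-in for relevance. The abstract does not specify how the Eigen-PCA dataset is built, so the sketch clusters plain two-dimensional PCA scores instead.

```python
import numpy as np

# Synthetic code-keyword matrix (illustrative values only, not from the study).
# Rows: hypothetical classification codes; columns: hypothetical search keywords.
codes = ["code_A", "code_B", "code_C", "code_D", "code_E", "code_F"]
keywords = ["kw_1", "kw_2", "kw_3", "kw_4"]
X = np.array([
    [50, 45, 40, 48],
    [52, 47, 43, 50],
    [20, 22, 18, 21],
    [22, 19, 20, 23],
    [2, 1, 3, 2],
    [1, 2, 2, 1],
], dtype=float)


def pca(X, n_components=2):
    """PCA via SVD of the column-centered matrix."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # coordinates of the codes
    loadings = Vt[:n_components].T                   # directions of the keywords
    explained = S**2 / np.sum(S**2)                  # variance ratio per component
    return scores, loadings, explained[:n_components]


def kmeans(points, k, n_iter=100):
    """Lloyd's k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(k - 1):
        # Distance from every point to its nearest already-chosen center.
        dist = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[dist.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


scores, loadings, explained = pca(X, n_components=2)
labels, centers = kmeans(scores, k=3)

# Assumption: rank clusters by mean keyword count as a stand-in for "relevance".
cluster_mean = {j: X[labels == j].mean() for j in np.unique(labels)}
order = sorted(cluster_mean, key=cluster_mean.get)  # ascending mean count
names = dict(zip(order, ["least", "medium", "high"]))
relevance = {code: names[label] for code, label in zip(codes, labels)}
print(relevance)
```

In a biplot, the rows of `scores` would be drawn as points (one per classification code) and the rows of `loadings` as arrows (one per keyword), so that codes and keywords pointing in similar directions can be read as correlated.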
