Abstract

Sustainable growth and development have recently become important issues for governments and corporations. However, maintaining sustainable development is very difficult, in part because sociocultural and political backgrounds change over time [1]. As these backgrounds change, the technologies for sustainability change as well. Governments and companies therefore attempt to predict and manage technology using patent analyses, but rapidly changing technology markets are very difficult to predict. The best way to gain insight into technology management in such a market is to build, through continuous monitoring and analysis, a technology management direction and strategy that is flexible and adaptable to the volatile market environment. Quantitative patent analysis using text mining is an effective method for sustainable technology management. Many studies have used text mining and word-based patent analyses to extract keywords and remove noise words. Because the extracted keywords are considered to have a significant effect on subsequent analysis, researchers need to check carefully whether they are valid. However, most prior studies assume that the extracted keywords are appropriate without evaluating their validity. Therefore, the criteria used to extract keywords need to change. Until now, these criteria have focused on how well a patent can be classified according to its technical characteristics within the collected patent data set, typically using term frequency–inverse document frequency (TF-IDF) weights that are calculated by comparing the words across patents. However, this approach is not suitable when analyzing a single patent. Therefore, we need keyword selection criteria and an extraction method capable of representing the technical characteristics of a single patent without comparing it with other patents.
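The limitation of corpus-based weighting for a single patent can be illustrated with a minimal sketch. It assumes the classical unsmoothed form of term frequency–inverse document frequency, weight = tf × log(N / df), and whitespace tokenization; the function name `tf_idf` and the toy texts are hypothetical and are not taken from the study. With only one document, N = df = 1 for every term, so every weight collapses to zero. Smoothed variants avoid the zero but give every term the same inverse document frequency, so the ranking reduces to raw term frequency.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


# Hypothetical toy texts standing in for patent abstracts.
corpus = [
    "battery electrode coating method",
    "battery separator film method",
    "wind turbine blade coating",
]

corpus_weights = tf_idf(corpus)      # terms are discriminated across patents
single_weights = tf_idf(corpus[:1])  # one patent only: every weight is zero
```

In the three-document case, "electrode" outweighs "battery" in the first text because it appears in fewer documents. In the single-document case no term can be distinguished from any other, which is why a criterion that does not depend on other patents is needed.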
In this study, we proposed a methodology to extract valid keywords from a single patent document using relevant papers and their authors' keywords. We evaluated the validity of the proposed method and its practical performance using a statistical verification experiment. First, by comparing the document similarity between papers and patents containing the same search terms in their titles, we verified the validity of extracting patent keywords using papers and their authors' keywords. Second, we confirmed that the proposed method improves precision by about 17.4% over the existing method. The outcome of this study is expected to increase the reliability and validity of research on text-mining-based patent analysis and to improve the quality of such studies.
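The two quantities mentioned in the evaluation, document similarity and precision, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the similarity measure or the reference set used, so cosine similarity over bag-of-words counts and precision against a reference keyword set are assumed here, and the function names `cosine_similarity` and `precision` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw term counts (assumed measure)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def precision(extracted, reference):
    """Fraction of extracted keywords that also appear in the reference set."""
    extracted = set(extracted)
    if not extracted:
        return 0.0
    return len(extracted & set(reference)) / len(extracted)


# Hypothetical example: four extracted keywords, two of them in the reference set.
example_precision = precision(
    ["electrode", "coating", "method", "device"],
    ["electrode", "coating", "lithium"],
)
```

Under these assumptions, a paper and a patent that share many terms score close to 1 on similarity, and a keyword extractor is judged by how many of its keywords fall inside the reference set.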

Highlights

  • Companies are striving to gain a competitive advantage through new technology development

  • Research focuses on the quantitative analysis of unstructured data, including patents, by applying text mining

  • Patent analysis using text mining is based on words and is preceded by noise word removal and keyword extraction

Summary

Introduction

Companies are striving to gain a competitive advantage through new technology development, and they can expect a profit when these efforts succeed. Patent database services make it possible to examine patent documents in order to gain insight into technology management, and the use of such databases has led to active research on patent analysis. Qualitative patent analysis by experts is problematic because the experts' subjectivity can influence the results. This shows that the performance is better than the keyword extraction from the abstract.
