Abstract

Patent litigation occurs when a company’s product or service infringes the scope of another company’s patent rights. When it occurs, the companies involved suffer disrupted sales of their products and services, which hinders the sustainability of their business activities. For this reason, companies have established and analyzed wide-ranging strategies to prevent patent litigation. Of those, statistical and machine learning-based quantitative methods using patent big data have several advantages, such as reduced cost and objective results. Existing quantitative methods analyze patent information and litigation as of the time of data collection. However, the values of patents and their litigation hazards change over time. In addition, the existing methods do not take into account censored data; that is, patents that may result in litigation after the data are collected. To address these problems, we propose an integrated survival model that considers censored data and predicts patent litigation hazards over time. The proposed model is a non-parametric survival analysis method based on a random survival forest. It uses pre-trained word2vec and clustering to effectively reflect the technology fields as well as the quantitative information of the patent. Word2vec is a natural language processing technique that enables the use of patent text information. To examine the practicality of the integrated survival model, an experiment was conducted with patent big data related to sensor semiconductors based on AI technology applicable to robotics. In the experiment, the litigation hazard appeared 150 months after the patent application and increased rapidly from 200 months. Furthermore, the proposed model showed better predictive performance than other survival analysis models. The proposed model could be used by potential defendants to protect their patents.
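
To make the role of censored data concrete, the following is a minimal sketch of a Kaplan-Meier estimator in plain Python. It is not the proposed model: the abstract describes a random survival forest, and Kaplan-Meier is swapped in here only as the simplest non-parametric survival estimate that uses censored observations. The function name kaplan_meier and the six toy patents are hypothetical and are not taken from the paper's data.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    durations: months from patent application to litigation, or to the
               data collection date if no litigation has been observed.
    events:    1 if litigation was observed, 0 if the patent is censored.
    """
    # Number of observed litigations at each time.
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    # Number of patents leaving the risk set at each time (event or censored).
    exit_counts = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Censored patents still count in the denominator up to the
            # time at which they were censored.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve


# Hypothetical toy data: six patents, three litigated, three censored.
months = [150, 160, 200, 200, 210, 240]
litigated = [1, 0, 1, 1, 0, 0]

curve = kaplan_meier(months, litigated)
for t, s in curve:
    print(f"month {t}: estimated probability of no litigation = {s:.3f}")
```

In this toy data, the patent censored at 160 months stays in the risk set for the event at 150 months but not for the events at 200 months, so it informs the estimate without being counted as either litigated or never litigated.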

Highlights

  • Patents contain varied and detailed information about the developed technology and indicate exclusive rights [1,2]

  • We propose an integrated survival model, based on the random survival forest (RSF), that can predict patent litigation hazards over time

Introduction

Patents contain varied and detailed information about the developed technology and indicate exclusive rights [1,2]. Companies can profit from patents, which legally protect the technology the companies have developed [3]. For this reason, patents are drawing increasing attention, and the number of global patent applications is rising [4]. Patent litigation occurs when a company’s patents, products or other operations infringe on the scope of other companies’ patent rights. When this happens, the companies’ sales of products and services are disrupted. Quantitative methods that use statistics or machine learning based on a vast volume
