Abstract

This paper introduces a measure of proximity in ideas using unsupervised machine learning. Knowledge transfers are considered a key driving force of innovation and regional economic growth. I explore knowledge relationships by deriving vector space representations of each patent's abstract text using Document Vectors (Doc2Vec), and then using cosine similarity to measure proximity between patents in ideas space. I illustrate the potential uses of this method with an application to the geographic localization of knowledge spillovers. For patents in the same technology field, normalized text similarity is 0.02-0.05 S.D.s higher if the patents are located within the same city than if they are located in different cities. This effect is much smaller than when knowledge transfers are measured using normalized patent citations: local patents receive about 0.23-0.30 S.D.s more local citations than non-local control patents. These findings suggest that the effect of geography on knowledge transfers may be much smaller than the previous citation-based literature suggests.

Highlights

  • This paper introduces a measure of proximity in ideas using unsupervised machine learning

  • I explore knowledge relationships in innovative ideas by: first, deriving vector space representations of patent abstract text using Document Vectors (Doc2Vec); second, using cosine similarity to measure their proximity in ideas space

  • In Table 8, I find that similarity is much higher for patents that share non-patent citations, which supports the claim that patent text reflects knowledge flows from external sources
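The two-step measure described in the highlights can be illustrated with a minimal sketch. It assumes the Doc2Vec vectors have already been trained and simply uses small hand-written placeholder vectors in their place. The names `patent_vectors`, `cosine_similarity`, and `standardize` are hypothetical, and standardizing with the population standard deviation is an assumption made for illustration, not necessarily the normalization used in the paper.

```python
import math
import statistics


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def standardize(values):
    """Express each value in standard deviations from the mean.

    Assumption: the population standard deviation is used here.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize values with zero variance")
    return [(x - mean) / sd for x in values]


# Placeholder vectors standing in for Doc2Vec embeddings of patent abstracts.
# Real embeddings would have far more dimensions.
patent_vectors = {
    "A": [1.0, 0.0, 0.0],
    "B": [1.0, 1.0, 0.0],
    "C": [0.0, 0.0, 1.0],
}

pairs = [("A", "B"), ("A", "C"), ("B", "C")]

# Step 1: raw cosine similarity for each patent pair.
raw = [cosine_similarity(patent_vectors[i], patent_vectors[j]) for i, j in pairs]

# Step 2: normalize so that differences can be read in S.D. units.
normalized = standardize(raw)

for (i, j), r, z in zip(pairs, raw, normalized):
    print(f"{i}-{j}: cosine={r:.4f}, normalized={z:.4f}")
```

Reading similarity in standard deviation units is what allows a statement such as "0.02-0.05 S.D.s higher" to be compared with the citation-based estimates, which are expressed on the same scale.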

Summary

Introduction

This paper introduces a measure of proximity in ideas using unsupervised machine learning. If patent lawyers do not play an important role in determining the localization of patent citations, then further selecting a control that matches on both primary class and attorney should not yield very different localization results from the baseline replication. If these results prove significant, this may explain why better measures of technological proximity do not yield lower estimates of localization: citations may be localized in part because lawyers' knowledge of "citable" patents is geographically concentrated, not necessarily because knowledge flows across patents are. These results indicate that controlling for technological proximity through primary class would attenuate some of the bias towards higher similarity in patent text that arises from the influence of external parties. In Table 7, the estimate of localization diminishes further when primary class similarity is included as a control for technological proximity: the estimate ranges from insignificant to 0.04 S.D.s above the mean, and the interaction effect is insignificant at the 5% level across all decades.

Summary of results
Findings
Discussion of patent text similarity
Conclusion