Abstract

The number of patent applications has grown to the point where large amounts of data must be processed with as much automation as possible. Professionals who analyze technical developments need a way to gain a quick overview of a technical domain. We conducted a large case study spanning more than 3 years of work at the Austrian Patent Office, during which we developed a toolchain of text mining methods for use by patent experts. Because the tools are intended for these experts, acceptance by the domain expert community was the major factor in our study. Apart from practical aspects, we experimented with a method that enriches the usual purely text-based topic modeling approaches with available metadata in the form of International Patent Classification (IPC) symbols, in an attempt to guide the topics towards readily recognizable labels. In this work, we describe this set of tools together with the best practices and lessons learned during this extensive period, which covered 7 case studies, two of which we describe in detail.

