Abstract

Recent years have witnessed a rapid increase in patent applications, which offers an opportunity to reveal the underlying laws of innovation but also places higher demands on patent mining techniques. Because patent mining relies heavily on patent document analysis, this paper focuses on constructing a technology portrait for each patent, i.e., recognizing the technical phrases it involves, which summarize and represent the patent from a technology perspective. To this end, we first give a clear and detailed description of technical phrases in patents, based on prior work and analyses. Then, combining the characteristics of technical phrases with the multi-level structure of patent documents, we develop an Unsupervised Multi-level Technical Phrase Extraction (UMTPE) model. In particular, we design a novel evaluation metric, Information Retrieval Efficiency (IRE), to evaluate the extracted phrases from a new perspective, supplementing traditional metrics such as Precision and Recall. Finally, extensive experiments on real-world patent data demonstrate the effectiveness of the UMTPE model.
