Abstract

Text simplification is a vital task for comprehending patent claims because of their complex syntactic structures and lengthy sentences. As a result, almost all patent analysis practitioners are unable to understand the essence of a patent directly and intuitively, even when common natural language processing (NLP) tools are applied to parse the claim paragraphs or sentences. Such general-purpose text analysis tools are almost useless, or even crash, when applied to some complex paragraphs of patent claims. It is therefore necessary to propose a simplification approach oriented to patent text that helps patent researchers grasp the essence of a patent quickly and intuitively. Motivated by this need, we propose in this article a simplification method based on deep learning that segments patent claims into shorter, comprehensible sentences for downstream patent analysis tasks. The proposed approach contains two stages: in the first stage, we use a machine learning approach, the conditional random field (CRF), to decompose syntactically complex paragraphs into coarse-grained sentences with simplified structures and complete semantics; in the second stage, a deep learning architecture, bidirectional long short-term memory (Bi-LSTM)-CRF, is applied to segment the coarse-grained, lengthy sentences from the first stage into fine-grained, shorter sentences. Compared with a series of baselines, our Bi-LSTM-CRF-based patent segmentation architecture achieves higher precision, recall, and F1 than all of the other methods.
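
The two-stage idea can be illustrated with a minimal sketch, assuming that both stages are framed as token-level boundary labeling ("B" starts a new segment, "I" continues it) and that precision, recall, and F1 are computed over predicted boundary positions. The abstract does not specify the label scheme or the evaluation unit, so both are assumptions. The two taggers below are hand-written placeholder rules standing in for the trained CRF and Bi-LSTM-CRF models, and every function name is hypothetical.

```python
from typing import Callable, List, Sequence, Set, Tuple

# A tagger maps a token sequence to one "B"/"I" label per token.
Tagger = Callable[[Sequence[str]], List[str]]


def split_by_labels(tokens: Sequence[str], labels: Sequence[str]) -> List[List[str]]:
    """Cut a token sequence wherever the label is "B" (start of a new segment)."""
    segments: List[List[str]] = []
    for token, label in zip(tokens, labels):
        if label == "B" or not segments:
            segments.append([])
        segments[-1].append(token)
    return segments


def two_stage_segment(tokens: Sequence[str], coarse: Tagger, fine: Tagger) -> List[List[str]]:
    """Stage 1 yields coarse sentences; stage 2 splits each of them further."""
    result: List[List[str]] = []
    for sentence in split_by_labels(tokens, coarse(tokens)):
        result.extend(split_by_labels(sentence, fine(sentence)))
    return result


def boundary_prf(gold: Set[int], pred: Set[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over sets of boundary token indices."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def toy_coarse_tagger(tokens: Sequence[str]) -> List[str]:
    """Placeholder for the stage-1 CRF: start a new segment after each ';'."""
    labels, previous = [], None
    for token in tokens:
        labels.append("B" if previous is None or previous == ";" else "I")
        previous = token
    return labels


def toy_fine_tagger(tokens: Sequence[str]) -> List[str]:
    """Placeholder for the stage-2 Bi-LSTM-CRF: start a new segment at 'wherein'."""
    return ["B" if i == 0 or t == "wherein" else "I" for i, t in enumerate(tokens)]


claim = "a device comprising a sensor ; a processor wherein the processor reads the sensor"
segments = two_stage_segment(claim.split(), toy_coarse_tagger, toy_fine_tagger)
for segment in segments:
    print(" ".join(segment))
```

On this toy claim, the first stage cuts after the semicolon and the second stage cuts before "wherein", giving three short segments. Replacing the placeholder rules with trained sequence labelers would leave the surrounding pipeline unchanged.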
