Abstract

AI techniques for contract analysis can significantly reduce human effort. This paper presents element tagging on insurance policies (ETIP), a named entity recognition (NER) problem in which the entities are lengthy and may be nested. Compared with conventional NER, ETIP must handle not only entity types whose length ranges from a short phrase to a long sentence, but also phrase or clause entities that may be nested inside one another. We present a novel hybrid framework that combines deep learning with heuristic filtering to recognize these lengthy nested elements. First, a convolutional neural network classifies sliding windows, and windows with high softmax probability are kept as initial candidates. Then, a concatenation operator is applied to adjacent candidate segments to create phrase, clause, or sentence candidates. We design a voting strategy to resolve classification conflicts among the concatenated candidates and present a theoretical proof of F1-score optimization. For the experiments, we collected a large dataset of Chinese insurance contracts to evaluate the proposed method. An extensive set of experiments investigates how sliding-window candidates interact with the filtering and voting strategy, and the optimal parameters are determined by statistical analysis of the experimental data. The results show that the method performs promisingly on the ETIP problem.
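
The filter, concatenate, and vote steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Window` record, the 0.9 threshold, the rule that overlapping or touching windows count as "adjacent", the probability-weighted vote, and the labels `coverage` and `exclusion` are all assumptions made for illustration. The sketch also ignores nesting and takes the window classifier's output as given.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """One classified sliding window (hypothetical record layout)."""
    start: int   # inclusive token offset
    end: int     # exclusive token offset
    label: str   # label with the highest softmax probability
    prob: float  # that softmax probability


def filter_windows(windows, threshold=0.9):
    """Keep only windows whose softmax probability reaches the threshold."""
    return [w for w in windows if w.prob >= threshold]


def concatenate_adjacent(windows):
    """Group windows that overlap or touch into concatenated candidates."""
    groups = []
    for w in sorted(windows, key=lambda w: (w.start, w.end)):
        if groups and w.start <= groups[-1]["end"]:
            # The window continues the current candidate: extend it.
            groups[-1]["end"] = max(groups[-1]["end"], w.end)
            groups[-1]["members"].append(w)
        else:
            groups.append({"start": w.start, "end": w.end, "members": [w]})
    return groups


def vote(members):
    """Resolve a label conflict by summing softmax probabilities per label."""
    scores = defaultdict(float)
    for w in members:
        scores[w.label] += w.prob
    return max(scores, key=scores.get)


def tag_elements(windows, threshold=0.9):
    """Return (start, end, label) for each concatenated candidate."""
    kept = filter_windows(windows, threshold)
    return [(g["start"], g["end"], vote(g["members"]))
            for g in concatenate_adjacent(kept)]


example = [
    Window(0, 5, "coverage", 0.95),
    Window(3, 8, "coverage", 0.92),
    Window(6, 11, "exclusion", 0.91),   # conflicting label inside the same span
    Window(12, 17, "coverage", 0.50),   # dropped by the filter
    Window(20, 25, "exclusion", 0.97),
]
print(tag_elements(example))
```

In this example the first three windows merge into one candidate spanning tokens 0 to 11, and the vote assigns it `coverage` because two high-probability windows outweigh the single conflicting one. The low-probability window is filtered out before concatenation, so the final window stays a separate candidate.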
