Abstract

Weakly supervised models are currently the usual way to expand event corpora, since they avoid expensive manual annotation. However, these models rely on a knowledge base and a small amount of manually annotated corpus data, which makes them poorly portable. To address this problem, we construct a public-domain event extraction model based on syntax trees. In this paper, we propose a classification of Chinese syntax tree structures from the perspective of event extraction, and present an event extraction algorithm for the various syntax tree types. Moreover, to build the trigger word dictionary, we use cross-corpus dictionary information to construct a Chinese trigger word dictionary from a semantic perspective. As a result, we obtain 40,128 Chinese news events, which form an initial corpus of Chinese news events.
