Abstract

Automatic extraction of key information is necessary for knowledge discovery in this era of rapid knowledge growth. It also helps researchers quickly obtain the information they want without reading through every potentially relevant document. Recently, researchers have shifted their attention from words to sentences, because sentences convey semantics better than individual words and reduce computational complexity. We present a novel and lightweight automatic keyphrase extraction algorithm that does not depend on any external resources, such as an external dictionary or corpus. Unlike traditional graph-based algorithms that iterate over words to generate keyphrase lists, our method iterates over sentences to rank words and generate keyphrase lists, because the semantic information carried by a sentence is more complete than that carried by a word. We initialize the values of words with weighted information and generate a score for each sentence from these values. We then integrate the sentences to update their scores, and the values of the words are in turn updated with the sentence information. We iterate this process until the values of the sentences and words converge. The proposed method is based on a measurement of the relations between sentences and an evaluation of the flow of these relations in an easily understood manner. These relations rest on the hypothesis that the causality between adjacent sentences is semantically stronger than the causality between words. The method not only increases extraction accuracy but also reduces the number of iterations the algorithm requires. We compare our proposed method with five strong, popular baseline algorithms on four datasets. The results show that our proposed method performs better than the other algorithms on three evaluation metrics.
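
The iteration described above (word values, then sentence scores, then sentence integration, then updated word values, repeated until convergence) can be illustrated with a minimal sketch. The abstract does not give the actual weighting or update formulas, so everything concrete here is an assumption made for illustration: the function name `rank_words`, normalised term frequency as the initial word weight, a sentence score equal to the sum of its word values, mixing each sentence with its adjacent sentences through a hypothetical `damping` parameter, and the stopping threshold `tol`. It should not be read as the authors' method.

```python
from collections import Counter


def rank_words(sentences, damping=0.85, tol=1e-6, max_iter=100):
    """Rank words by iterating between word values and sentence scores.

    sentences: list of token lists, in document order.
    Returns (word_scores, iterations_used).
    All formulas below are illustrative assumptions, not the paper's.
    """
    counts = Counter(w for s in sentences for w in s)
    if not counts:
        return {}, 0

    # Assumed initial weighting: normalised term frequency.
    total = sum(counts.values())
    words = {w: c / total for w, c in counts.items()}

    iteration = 0
    for iteration in range(1, max_iter + 1):
        # 1. Score each sentence from the current word values.
        raw = [sum(words[w] for w in s) for s in sentences]

        # 2. Integrate sentences: each one absorbs part of the scores of
        #    its adjacent sentences (assumed form of the "flow of relations").
        sent = []
        for i, score in enumerate(raw):
            near = [raw[j] for j in (i - 1, i + 1) if 0 <= j < len(raw)]
            flow = sum(near) / len(near) if near else score
            sent.append((1 - damping) * score + damping * flow)

        # 3. Push sentence scores back onto the words they contain,
        #    then renormalise so the word values sum to 1.
        new = {w: 0.0 for w in words}
        for s, score in zip(sentences, sent):
            for w in s:
                new[w] += score
        norm = sum(new.values())
        new = {w: v / norm for w, v in new.items()}

        # 4. Stop once no word value moves by more than tol.
        delta = max(abs(new[w] - words[w]) for w in words)
        words = new
        if delta < tol:
            break

    return words, iteration


doc = [
    ["graph", "ranking", "keyphrase"],
    ["keyphrase", "extraction", "sentence"],
    ["sentence", "ranking", "keyphrase"],
]
scores, n_iter = rank_words(doc)
top = sorted(scores, key=scores.get, reverse=True)[:3]
```

In this sketch the highest-scoring entries of `scores` would serve as keyphrase candidates; a real system would also need tokenisation, stop-word filtering, and a step that merges adjacent words into multi-word phrases, none of which the abstract specifies.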
