Abstract

Keywords and keyphrases are important for Natural Language Processing (NLP) applications such as document classification, information retrieval, and topic identification. They are also useful for capturing different classes of entities in content from fields such as healthcare, biology, food science, and journalism. There are several approaches to extracting keywords and keyphrases, and deep learning approaches have achieved strong results on this task. However, among deep learning approaches, the potential of Convolutional Neural Networks (CNNs) for extracting keywords and keyphrases has not been fully explored. In this work, we performed a comparative study using a benchmark dataset, the IEEE Xplore collection, to test how well a CNN generalizes in selecting keywords and keyphrases. In addition, we collected a corpus in the field of foodborne illness outbreaks and used it to develop a CNN-based approach for identifying keywords and keyphrases related to foodborne illnesses. Results were compared with several supervised (KEA, GuidedLDA) and unsupervised (LDA) machine learning algorithms. The CNN outperformed these algorithms in selecting relevant keywords and keyphrases for foodborne illnesses. The findings of this study also confirm the superiority of the CNN-based algorithm over other machine learning approaches for keyphrase extraction.

