Abstract

Important information in scientific papers can be composed of rhetorical sentences that is structured from certain categories. To get this information, text categorization should be conducted. Actually, some works in this task have been completed by employing word frequency, semantic similarity words, hierarchical classification, and the others. Therefore, this paper aims to present the rhetorical sentence categorization from scientific paper by employing TF-IDF and Word2Vec to capture word frequency and semantic similarity words and employing hierarchical classification. Every experiment is tested in two classifiers, namely Naïve Bayes and SVM Linear. This paper shows that hierarchical classifier is better than flat classifier employing either TF-IDF or Word2Vec, although it increases only almost 2% from 27.82% when using flat classifier until 29.61% when using hierarchical classifier. It shows also different learning model for child-category can be built by hierarchical classifier.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.