Abstract
Standard text categorization relies on features selected for each category. Training texts are used to pick the most representative features for each category. Hence, selecting relevant training texts for the right category directly influences categorization performance. In this paper, we propose a transfer training approach in which a category model is trained on one source and applied to another. This enables quicker and more focused training, which is useful for scientific text categorization. Performance evaluation was conducted under three settings that differ in (i) the number of categories and (ii) the source of training texts (i.e., automated source selection from WikiCFP or manual source selection from Book TOC). The evaluation in an expert search setting showed that using more categories with manual sourcing of training texts (accuracy of 54.21%) outperforms the other settings, which use fewer categories or automated sourcing of training texts.