Abstract
Although hierarchical structure information can improve hierarchical text classification, existing methods suffer from error propagation, especially in large-scale, deep hierarchies. In this paper, we define the concept of a path-based semantic vector for representing categories, on the basis of which prior information provided by the training set can be employed in a classifier-independent way to reduce, and further eliminate, classification errors. In particular, we first propose an occurrence-probability-based strategy for hierarchical text classification, which helps limit the error rate efficiently. Co-occurrence probability is then introduced to correct classification errors that occur at higher levels of the hierarchy. Extensive experiments show that our hierarchical classification strategies perform well on the ODP dataset, even at deep levels of the hierarchy.