Abstract
In natural language processing, hierarchical text classification (HTC) has emerged as a critical task for organizing and analyzing large volumes of text data. Previous HTC approaches often fail to fully leverage the hierarchical structure of labels, resulting in suboptimal performance. They also struggle to capture nuanced relationships between parent and child classes, which leads to inaccurate predictions and insufficient differentiation between sibling classes under the same parent category. This gap underscores the need for approaches that more effectively integrate both hierarchical and corpus-specific information to improve HTC performance. To address these issues, we propose a Concept-aware Prompt Mechanism (CPM) for HTC, which leverages the concept information embedded in hierarchical labels to enhance label representations and improve classification accuracy. Specifically, we introduce a concept initialization module that extracts concept features from hierarchical labels, and a novel concept prompt template that integrates these features into the classification process. Experimental results demonstrate that CPM achieves state-of-the-art performance on two benchmark datasets, improving Micro-F1 and Macro-F1 scores to varying degrees, particularly on datasets with complex label hierarchies.