Abstract

Existing multilabel text classification methods rely on complex, manually designed mechanisms to mine label correlations, which risks overfitting and ignores the relationship between the text and the labels. To address these problems, this paper proposes a multilabel text classification algorithm based on a transformer encoder–decoder, which can adaptively extract the dependencies between different labels and the text. First, text representations are learned through word embedding and a bidirectional long short-term memory (BiLSTM) network. Second, global relationships within the text are modeled by the transformer encoder, and multilabel queries are then adaptively learned by the transformer decoder. Finally, a weighted fusion strategy supervised by multiple loss functions is proposed to further improve classification performance. Experimental results on the AAPD and RCV1-V2 datasets show that the proposed algorithm outperforms existing methods, reaching optimal micro-F1 scores of 73.4% and 87.8%, respectively, demonstrating its effectiveness.
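The pipeline described above (embedding → BiLSTM → transformer encoder → transformer decoder with learnable label queries → per-label logits) can be sketched as follows. This is a minimal illustrative reconstruction, not the authors' implementation: all hyperparameters (vocabulary size, model width, layer counts, head counts) are placeholders, and the 54-label output merely mirrors the AAPD label count for concreteness.

```python
import torch
import torch.nn as nn

class LabelQueryClassifier(nn.Module):
    """Hypothetical sketch of the abstract's architecture:
    embedding -> BiLSTM -> transformer encoder -> transformer decoder
    with one learnable query per label -> per-label logits."""

    def __init__(self, vocab_size=1000, num_labels=54, d_model=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # BiLSTM: hidden size d_model//2 per direction so outputs stay d_model wide.
        self.bilstm = nn.LSTM(d_model, d_model // 2, batch_first=True,
                              bidirectional=True)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=2)
        # One learnable query vector per label; the decoder attends them to the text.
        self.label_queries = nn.Parameter(torch.randn(num_labels, d_model))
        self.head = nn.Linear(d_model, 1)  # score each label's decoded query

    def forward(self, tokens):  # tokens: (batch, seq_len) of word indices
        x, _ = self.bilstm(self.embed(tokens))       # local context
        memory = self.encoder(x)                     # global text relationships
        queries = self.label_queries.unsqueeze(0).expand(tokens.size(0), -1, -1)
        h = self.decoder(queries, memory)            # label-to-text dependencies
        return self.head(h).squeeze(-1)              # (batch, num_labels) logits

model = LabelQueryClassifier()
logits = model(torch.randint(0, 1000, (2, 16)))
print(logits.shape)  # torch.Size([2, 54])
```

Training would apply a sigmoid per label with binary cross-entropy; the abstract's weighted multi-loss fusion is omitted here for brevity.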

