Abstract
In recent years, alongside the rapid development of deep learning in the natural language processing (NLP) domain, a number of notable multilingual pre-trained language models have been proposed. These multilingual text analysis and mining models have demonstrated state-of-the-art performance on several fundamental NLP tasks, including cross-lingual text classification (CLC). However, these multilingual pre-trained language models still suffer from limitations when adapted, via task-specific fine-tuning, to low-resource languages. They also have difficulty preserving global semantic information (e.g., topics) and long-range relationships between words, both of which are needed to fine-tune effectively and handle the cross-lingual text classification task well. To meet these challenges, in this article we propose TG-CTC, a novel topic-driven, multi-typed text graph attention-based representation learning method for the cross-lingual text classification problem. In the proposed TG-CTC model, we utilize a fused topic-driven, multi-typed text graph representation to jointly learn the rich structural and global semantic information of texts for the CLC task. More specifically, we integrate a heterogeneous text graph attention network with a neural topic modelling approach to enrich the semantic information of the learned textual representations across multiple languages. Extensive experiments on benchmark multilingual datasets demonstrate the effectiveness of the proposed TG-CTC model compared with contemporary state-of-the-art baselines.
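To make the described architecture more concrete, the following is a minimal sketch of the general idea the abstract outlines: encoding a document-word text graph with a graph-attention layer and fusing the resulting document-node embeddings with topic mixtures from a small neural topic model before classification. This is not the authors' released implementation; all module names, dimensions, and the fusion-by-concatenation choice are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the authors' code): a document-word text graph
# encoded with one graph-attention layer, fused with a topic vector from a small
# VAE-style neural topic model, then classified.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphAttentionLayer(nn.Module):
    """One attention head over a dense adjacency mask (doc + word nodes).
    The adjacency is assumed to include self-loops so every node attends to itself."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x, adj):
        h = self.proj(x)                                   # (N, out_dim)
        n = h.size(0)
        # Pairwise attention logits for every (i, j) node pair.
        pairs = torch.cat(
            [h.unsqueeze(1).expand(n, n, -1), h.unsqueeze(0).expand(n, n, -1)],
            dim=-1,
        )
        logits = torch.tanh(self.attn(pairs)).squeeze(-1)  # (N, N)
        logits = logits.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(logits, dim=-1)              # attention weights
        return F.elu(alpha @ h)                            # aggregated node states


class NeuralTopicModel(nn.Module):
    """VAE-style encoder mapping a bag-of-words vector to a topic mixture."""

    def __init__(self, vocab_size, num_topics):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(vocab_size, 256), nn.ReLU())
        self.mu = nn.Linear(256, num_topics)
        self.logvar = nn.Linear(256, num_topics)

    def forward(self, bow):
        h = self.encoder(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return torch.softmax(z, dim=-1)                    # (B, num_topics)


class TopicGraphClassifier(nn.Module):
    """Fuse document-node embeddings with topic mixtures for classification."""

    def __init__(self, node_dim, hidden_dim, vocab_size, num_topics, num_classes):
        super().__init__()
        self.gat = GraphAttentionLayer(node_dim, hidden_dim)
        self.ntm = NeuralTopicModel(vocab_size, num_topics)
        self.classifier = nn.Linear(hidden_dim + num_topics, num_classes)

    def forward(self, node_feats, adj, doc_idx, bow):
        nodes = self.gat(node_feats, adj)                  # all node embeddings
        doc_vec = nodes[doc_idx]                           # document nodes only
        topic_vec = self.ntm(bow)                          # per-document topic mixtures
        return self.classifier(torch.cat([doc_vec, topic_vec], dim=-1))
```

The concatenation of graph-derived document embeddings with topic mixtures is one plausible fusion strategy; the paper itself may use a different fusion mechanism, multiple attention heads, or additional node types in the heterogeneous graph.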