Abstract

Classification techniques lie at the heart of many real-world applications, such as sentiment analysis, recommender systems and automatic text annotation, where they are used to process and analyse large-scale textual data across many fields. However, natural language processing models are typically effective only when a large amount of up-to-date training data is available. Data is created at an unprecedented rate and new topics keep emerging, making it increasingly difficult, or even infeasible, to collect labelled samples covering all topics for model training. We study the extreme case: no labelled data is available for training, and the model is applied directly to test samples without being adapted to any specific dataset. We propose a transformer-based framework that encodes sentences in a contextualised way and leverages existing knowledge resources, namely ConceptNet and WordNet, to integrate both descriptive and structural knowledge for better performance. To enhance the robustness of the model, we design an adversarial example generator based on relations from external knowledge bases. The framework is evaluated on both general and domain-specific text classification datasets. Results show that it outperforms competitive state-of-the-art baselines, establishing new benchmark results.
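
To make the zero-shot setting concrete, here is a minimal sketch of classification by matching a sentence against natural-language label descriptions with a sentence encoder. It is an illustration under stated assumptions, not the paper's architecture: the `sentence-transformers` model choice, the label set and the hand-written descriptions (standing in for the ConceptNet/WordNet knowledge the framework integrates) are all hypothetical.

```python
# Zero-shot classification sketch: embed the input sentence and a short
# description of each label, then pick the label with the closest embedding.
# The descriptions below are placeholders standing in for knowledge-base
# glosses; this is NOT the paper's exact model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works

# Hypothetical label set with short descriptions (e.g. WordNet-style glosses).
labels = {
    "sports":   "an activity involving physical exertion, games and athletes",
    "politics": "activities associated with governance, elections and policy",
    "science":  "the systematic study of the physical and natural world",
}

def classify(sentence: str) -> str:
    """Return the label whose description is most similar to the sentence."""
    sent_emb = model.encode(sentence, convert_to_tensor=True)
    desc_embs = model.encode(list(labels.values()), convert_to_tensor=True)
    scores = util.cos_sim(sent_emb, desc_embs)[0]  # cosine similarity per label
    return list(labels)[int(scores.argmax())]

print(classify("The striker scored twice in the second half."))  # -> "sports"
```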

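The abstract's knowledge-driven adversarial example generator could, in spirit, resemble the sketch below, which perturbs a sentence by substituting WordNet synonyms via NLTK. The `perturb` helper and the random sampling scheme are assumptions for illustration; the paper's generator draws on knowledge-base relations more broadly than plain synonymy.

```python
# Sketch of a knowledge-driven adversarial example generator: randomly swap
# words for WordNet-related lemmas. Purely illustrative, not the paper's method.
import random
from nltk.corpus import wordnet  # requires: nltk.download("wordnet")

def perturb(sentence: str, p: float = 0.3) -> str:
    """Replace each word with a WordNet synonym with probability p."""
    out = []
    for word in sentence.split():
        # Collect candidate lemmas from all synsets of the word.
        lemmas = {
            lemma.name().replace("_", " ")
            for synset in wordnet.synsets(word)
            for lemma in synset.lemmas()
            if lemma.name().lower() != word.lower()
        }
        if lemmas and random.random() < p:
            out.append(random.choice(sorted(lemmas)))
        else:
            out.append(word)
    return " ".join(out)

print(perturb("The movie was a fantastic surprise"))
```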