Abstract

The performance of text classification has improved tremendously with intelligently engineered neural models, especially those that inject categorical metadata as additional information, e.g., using user/product information for sentiment classification. This information has been used to modify parts of the model (e.g., word embeddings, attention mechanisms) so that results can be customized according to the metadata. We observe that current representation methods for categorical metadata, which are devised for human consumption, are not as effective as claimed in popular classification methods; they are outperformed even by simple concatenation of categorical features in the final layer of the sentence encoder. We conjecture that categorical features are harder to represent for machine use because the available context only indirectly describes the category, and even such context is often scarce (for tail categories). To this end, we propose using basis vectors to effectively incorporate categorical metadata in various parts of a neural model. This additionally reduces the number of parameters dramatically, especially when the number of categorical features is large. Extensive experiments on datasets with different properties show that our method represents categorical metadata more effectively, customizes parts of the model (including previously unexplored ones), and greatly improves model performance.
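To make the basis-vector idea concrete, here is a minimal PyTorch sketch. All names, dimensions, and the softmax mixing scheme are our illustrative assumptions, not the paper's exact formulation: each category learns only a small coefficient vector that mixes a few shared basis weight matrices, instead of learning a full weight matrix of its own.

```python
import torch
import torch.nn as nn

class BasisCustomizedLinear(nn.Module):
    """Hypothetical sketch of the basis-vector idea: instead of a separate
    weight matrix per category (num_categories x in_dim x out_dim parameters),
    learn a few shared basis matrices plus a small per-category coefficient
    vector that mixes them."""

    def __init__(self, num_categories, num_bases, in_dim, out_dim):
        super().__init__()
        # Shared basis weight matrices: (num_bases, in_dim, out_dim)
        self.bases = nn.Parameter(torch.randn(num_bases, in_dim, out_dim) * 0.01)
        # Per-category mixing coefficients: (num_categories, num_bases)
        self.coeffs = nn.Embedding(num_categories, num_bases)

    def forward(self, x, category_ids):
        # x: (B, in_dim); category_ids: (B,)
        # Softmax over bases gives each category a convex mixture.
        mix = torch.softmax(self.coeffs(category_ids), dim=-1)   # (B, num_bases)
        # Category-specific weight: weighted sum of the shared bases.
        weight = torch.einsum('bk,kio->bio', mix, self.bases)    # (B, in_dim, out_dim)
        return torch.einsum('bi,bio->bo', x, weight)             # (B, out_dim)
```

Under these assumptions, the parameter count drops from num_categories × in_dim × out_dim to num_bases × in_dim × out_dim + num_categories × num_bases, which is the dramatic reduction the abstract refers to when the number of categories is large.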

Highlights

  • Text classification is the backbone of most NLP tasks: review classification in sentiment analysis (Pang et al., 2002), paper classification in scientific data discovery (Sebastiani, 2002), and question classification in question answering (Li and Roth, 2002), to name a few

  • Metadata is generated for human understanding, and we claim that these categories need to be carefully represented for machine use

  • We present five levels of Customized Bidirectional Long Short-Term Memory (BiLSTM), which differ in where we inject the categorical features, listed here from the highest to the lowest level of dependency between text and categories (see the sketch after this list)
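The five injection sites themselves are not enumerated in this summary, so the sketch below (a hypothetical PyTorch module; class and argument names are ours) only contrasts two extremes: injecting the category embedding at the word-embedding level versus concatenating it at the encoder output, the simple final-layer baseline the abstract mentions.

```python
import torch
import torch.nn as nn

class CustomizedBiLSTM(nn.Module):
    """Illustrative sketch (names and the choice of two injection sites are
    assumptions): a BiLSTM classifier where a category embedding is injected
    either at the word-embedding level or at the encoder output."""

    def __init__(self, vocab_size, num_categories, emb_dim=100,
                 cat_dim=64, hidden=128, num_classes=5, level='embedding'):
        super().__init__()
        self.level = level
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.cat_emb = nn.Embedding(num_categories, cat_dim)
        lstm_in = emb_dim + (cat_dim if level == 'embedding' else 0)
        self.encoder = nn.LSTM(lstm_in, hidden, bidirectional=True,
                               batch_first=True)
        clf_in = 2 * hidden + (cat_dim if level == 'output' else 0)
        self.classifier = nn.Linear(clf_in, num_classes)

    def forward(self, tokens, category_ids):
        w = self.word_emb(tokens)                       # (B, T, emb_dim)
        c = self.cat_emb(category_ids)                  # (B, cat_dim)
        if self.level == 'embedding':
            # Low-level injection: append the category vector to every word.
            w = torch.cat([w, c.unsqueeze(1).expand(-1, w.size(1), -1)], dim=-1)
        h, _ = self.encoder(w)                          # (B, T, 2*hidden)
        doc = h.mean(dim=1)                             # simple mean pooling
        if self.level == 'output':
            # High-level injection: concatenate at the final layer only.
            doc = torch.cat([doc, c], dim=-1)
        return self.classifier(doc)
```

The earlier the injection, the more the text representation itself can depend on the category, which is the dependency ordering the highlight describes.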


Summary

Introduction

Text classification is the backbone of most NLP tasks: review classification in sentiment analysis (Pang et al., 2002), paper classification in scientific data discovery (Sebastiani, 2002), and question classification in question answering (Li and Roth, 2002), to name a few. We are inspired by the advancement of neural models that incorporate categorical information "as is" and inject it into various parts of the model, such as the word embeddings (Tang et al., 2015), the attention mechanism (Chen et al., 2016; Amplayo et al., 2018a), and memory networks (Dou, 2017). These methods, in theory, make use of combined textual and categorical features, which makes them more powerful than disconnected features.
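As one concrete example of such injection, here is a hedged sketch of a category-aware attention mechanism in the spirit of Chen et al. (2016): the attention scores over encoder states depend on a user/product embedding, so different users attend to different words. The module name, dimensions, and scoring function are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CategoryAwareAttention(nn.Module):
    """Illustrative sketch: attention over encoder states conditioned on a
    category (e.g., user or product) embedding."""

    def __init__(self, hidden_dim, cat_dim):
        super().__init__()
        self.proj = nn.Linear(hidden_dim + cat_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, states, cat_vec):
        # states: (B, T, hidden_dim); cat_vec: (B, cat_dim)
        T = states.size(1)
        c = cat_vec.unsqueeze(1).expand(-1, T, -1)           # (B, T, cat_dim)
        e = self.score(torch.tanh(self.proj(torch.cat([states, c], dim=-1))))
        alpha = torch.softmax(e, dim=1)                      # (B, T, 1)
        return (alpha * states).sum(dim=1)                   # (B, hidden_dim)
```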

