Abstract
Health mention classification is the task of determining whether a given piece of text constitutes a health mention. However, figurative usage of disease words makes this task challenging. To address this challenge, considering emojis and the words surrounding disease names in the text can be helpful. Transformer-based methods are better than traditional methods at capturing the meaning of a word from its surrounding context. However, numerous transformer-based models are available, pretrained on general natural language processing (NLP) corpora that differ inherently from Twitter data. Moreover, these models vary in size in terms of the number of parameters. Hence, it is challenging to choose one of these models to fine-tune on downstream tasks such as tweet classification. In this work, we experiment with nine widely used transformer models and compare their performance on personal health mention classification of tweet data. Furthermore, we analyze the impact of model size on classification performance and provide a brief interpretation of the classification decisions made by the best-performing classifier. Experimental results show that RoBERTa outperforms all other models with an F1 score of 93%, while two other models follow closely with an F1 score of 92.5%.
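To make the fine-tuning setup described above concrete, the following is a minimal sketch of binary health mention classification with a pretrained transformer, assuming a Hugging Face checkpoint ("roberta-base") and toy tweet-style examples; the checkpoint name, labels, and hyperparameters shown here are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: fine-tuning a pretrained transformer for binary
# health mention classification of tweets. The model name, toy data,
# and hyperparameters are assumptions for illustration only.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import Dataset

# Hypothetical examples: figurative vs. literal use of a disease word,
# with emojis kept in the text as potential context cues.
train = Dataset.from_dict({
    "text": ["this traffic gives me a headache 😩",
             "been in bed all week with the flu 🤒"],
    "label": [0, 1],  # 0 = figurative / non-health, 1 = personal health mention
})

model_name = "roberta-base"  # one of several transformer variants one could compare
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def tokenize(batch):
    # Byte-level BPE tokenization handles emojis without special preprocessing.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=64)

train = train.map(tokenize, batched=True)

args = TrainingArguments(output_dir="out",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=train).train()
```

Swapping `model_name` for other checkpoints (e.g. BERT- or DistilBERT-style models) is how a comparison across transformer variants of different parameter counts could be run under this sketch.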