COVCOR20 at WNUT-2020 Task 2: An Attempt to Combine Deep Learning and Expert rules

Ali Hürriyetoğlu, Ali Safaya, Osman Mutlu, Nelleke Oostdijk, Erdem Yörük

doi:10.18653/v1/2020.wnut-1.75

Abstract

In the scope of WNUT-2020 Task 2, we developed text classification systems, two using deep learning models and one using linguistically informed rules. While both deep learning systems outperformed the rule-based system, we found that integrating the output of the three systems achieved better performance than any standalone system in a cross-validation setting. However, on the test data the performance of the integration was slightly lower than that of our best-performing deep learning model. These results hardly indicate any progress toward integrating machine learning and expert-rule-driven systems. We expect that the release of the annotation manuals and gold labels of the test data after this workshop will shed light on these perplexing results.

Highlights

The COVID-19 pandemic urged various scientific disciplines to contribute to understanding and relieving its impact.
We joined the community that aims at organizing data collected from social media by classifying it as informative or uninformative.
Just as in BERT-CNN, we feed the output of the last hidden layer of CharRNN to a Convolutional Neural Network (CNN).

Summary

Introduction

The COVID-19 pandemic urged various scientific disciplines to contribute to understanding and relieving its impact. WNUT-2020 Task 2 requires the participating teams to develop short-text classification systems that use the released training and development data to generalize to the test set released by the organizers. Each team was allowed to submit only two system outputs for classifying tweets on the Codalab page of the task. Integrating automatically created machine learning (ML) models with manually formulated rules to tackle a text classification task promises the best of both worlds. We pursued this goal by integrating the output of two deep learning models and a rule-based system under the team name COVCOR20.
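As a rough illustration of what integrating the output of three systems can look like, the sketch below combines per-tweet labels by simple majority vote. This summary does not state which integration method was actually used, so majority voting is an assumption made here for illustration only. The names `majority_vote`, `integrate`, `model_a`, `model_b`, and `rules`, as well as the example predictions, are hypothetical.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label that most systems assigned to a single tweet."""
    label, _count = Counter(votes).most_common(1)[0]
    return label


def integrate(system_outputs):
    """Combine label lists from several systems, one list per system.

    All lists must have the same length and follow the same tweet order.
    With three systems and two labels a tie cannot occur.
    """
    lengths = {len(output) for output in system_outputs}
    if len(lengths) != 1:
        raise ValueError("all systems must label the same number of tweets")
    return [majority_vote(votes) for votes in zip(*system_outputs)]


# Hypothetical predictions for three tweets (not results from the paper).
model_a = ["INFORMATIVE", "UNINFORMATIVE", "INFORMATIVE"]
model_b = ["INFORMATIVE", "INFORMATIVE", "UNINFORMATIVE"]
rules = ["UNINFORMATIVE", "UNINFORMATIVE", "UNINFORMATIVE"]

combined = integrate([model_a, model_b, rules])
print(combined)
```

With an odd number of systems and a binary label set, voting always yields a decision; a weighted or learned combination would be a natural alternative if the systems differ strongly in accuracy.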

Deep learning models

Rule-based system

Results

Conclusion and future work

Integrating outputs