Abstract
With society going online and disinformation being accepted as a phenomenon that we have to live with, there is a growing need to automatically detect offensive text on modern social media platforms. However, the lack of sufficient balanced labeled data, constantly evolving socio-linguistic patterns, and the ever-changing definition of offensive text make this a challenging task. The same pattern is seen across disinformation detection tasks such as the detection of propaganda, rumour, fake news, and hate. The work described in this paper improves upon the existing body of techniques by introducing a framework that can surpass the existing benchmarks. Firstly, it addresses the imbalanced and insufficient nature of the available labeled data. Secondly, learning from related tasks through multi-task learning has proven to be an effective approach in this domain, but it has the unrealistic requirement of labeled data for all related tasks. The framework presented here uses transfer learning in lieu of multi-task learning to address this issue. Thirdly, it builds a model that explicitly addresses the hierarchical nature of the taxonomy of the disinformation being detected, as this delivers stronger error feedback to the learning tasks. Finally, the model is made more robust through adversarial training. The paper uses offensive text detection as a case study and shows convincing results for the chosen approach. The framework can be easily replicated in other learning tasks facing a similar set of challenges.