Multitask deep learning for native language identification

Vuk Habic, Alexander Semenov, Eduardo L Pasiliao

doi:10.1016/j.knosys.2020.106440

Abstract

Identifying a person's native language from text they have written in English (L1 identification) plays an important role in tasks such as authorship profiling and identification. With the current proliferation of misinformation in social media, these methods are especially topical. Most studies in this field have focused on developing supervised classification algorithms that are trained on a single L1 dataset. Although multiple labeled datasets are available for L1 identification, they contain texts authored by speakers of different languages, and their language sets do not completely overlap. Current approaches achieve high accuracy on the available datasets, but only by training an individual classifier for each dataset. Studies show that jointly training multiple classifiers on different datasets lets the classifiers share information, which increases accuracy on each task. In this study, we develop a novel deep neural network (DNN) architecture for L1 classification; it is based on an adversarial multitask learning method that integrates shared knowledge from multiple L1 datasets. We propose several variants of the architecture and rigorously evaluate their performance on multiple datasets. Our results indicate that the proposed multitask architecture achieves higher classification accuracy than previously proposed methods.
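The abstract does not specify the architecture, so the following is only a minimal sketch of how an adversarial multitask objective is commonly set up, not the authors' model. It assumes a generic shared-private layout: one shared encoder, one private encoder and one output head per dataset (so each dataset keeps its own L1 label set), and a discriminator that tries to guess the source dataset from the shared features. All names (`multitask_loss`, `zero_params`, `adv_weight`) and the single-layer `tanh` encoders are hypothetical choices made for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # Negative log-likelihood of the target class.
    return -math.log(softmax(logits)[target])

def linear(weights, x):
    # weights is a list of rows; returns the matrix-vector product.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def zero_params(n_in, n_hidden, classes_per_dataset):
    # Hypothetical parameter container; each dataset has its own number of L1 classes.
    zeros = lambda rows, cols: [[0.0] * cols for _ in range(rows)]
    return {
        "shared": zeros(n_hidden, n_in),
        "private": [zeros(n_hidden, n_in) for _ in classes_per_dataset],
        "heads": [zeros(k, 2 * n_hidden) for k in classes_per_dataset],
        "discriminator": zeros(len(classes_per_dataset), n_hidden),
    }

def multitask_loss(x, dataset_id, label, params, adv_weight=0.05):
    # Shared features are computed the same way for every dataset.
    shared = [math.tanh(v) for v in linear(params["shared"], x)]
    # Private features and the output head depend on the source dataset.
    private = [math.tanh(v) for v in linear(params["private"][dataset_id], x)]
    task_logits = linear(params["heads"][dataset_id], shared + private)
    task_loss = cross_entropy(task_logits, label)
    # The discriminator predicts which dataset the shared features came from.
    disc_logits = linear(params["discriminator"], shared)
    disc_loss = cross_entropy(disc_logits, dataset_id)
    # The encoders minimize the task loss while trying to maximize the
    # discriminator loss, which pushes shared features to be dataset-invariant.
    encoder_objective = task_loss - adv_weight * disc_loss
    return task_loss, disc_loss, encoder_objective

# Two hypothetical datasets with 3 and 2 L1 classes respectively.
params = zero_params(n_in=4, n_hidden=3, classes_per_dataset=[3, 2])
losses = multitask_loss([1.0, 0.5, -0.5, 2.0], dataset_id=0, label=1, params=params)
```

In a real training loop the discriminator would be updated to minimize `disc_loss` while the encoders are updated on `encoder_objective`, typically with an automatic differentiation framework; this sketch only evaluates the forward pass.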

Journal: Knowledge-Based Systems
Publication Date: Sep 16, 2020
Citations: 12
License type: cc-by-nc-nd
