A hybrid approach to classifying Wikipedia article quality flaws with feature fusion framework

Ping Wang, Muyan Li, Xiaodan Li, Heshen Zhou, Jingrui Hou

doi:10.1016/j.eswa.2021.115089

Abstract

Article quality has always been a major concern for Wikipedia. Improving article quality first requires identifying defects, so flaw classification has attracted considerable attention. Several machine-learning-based approaches exist for this task, including deep learning models, based on either manually constructed or autoextracted features. However, features of a single type may not describe articles comprehensively. To improve flaw classification, we propose a feature fusion framework that combines handcrafted and autoextracted features. We first use a rule-based method from a previously proposed framework to extract handcrafted features. We then obtain autoextracted features using Bidirectional Encoder Representations from Transformers (BERT) and various deep learning models, including bidirectional long short-term memory (Bi-LSTM), bidirectional gated recurrent unit (Bi-GRU), bidirectional recurrent neural network (Bi-RNN), and multihead self-attention models. Finally, the handcrafted features are standardized and concatenated with the autoextracted features, and the concatenated features are fed into a feedforward neural network for classification. We compare 12 different classifiers in detail in terms of training performance, classification performance, and model training time. The experiments show that the proposed feature fusion framework can notably improve the effectiveness of quality flaw classification for Wikipedia articles. In particular, a Bi-GRU model based on the proposed framework achieves excellent classification accuracy.
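The fusion step described in the abstract (standardize the handcrafted features, concatenate them with the autoextracted features, and pass the result to a feedforward network) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature values, the two example handcrafted features, the layer sizes, the number of flaw classes, and the random untrained weights are all assumptions made for illustration, and the autoextracted vectors are stand-ins for the output of an encoder such as BERT followed by a Bi-GRU.

```python
import math
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each handcrafted feature column (zero mean, unit variance)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]


def fuse(handcrafted, autoextracted):
    """Concatenate standardized handcrafted features with autoextracted ones."""
    return [h + a for h, a in zip(standardize(handcrafted), autoextracted)]


def feedforward(x, w_hidden, w_out):
    """One ReLU hidden layer, then a softmax over flaw classes (biases omitted)."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, unit))) for unit in w_hidden]
    logits = [sum(hi * w for hi, w in zip(hidden, unit)) for unit in w_out]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical data: three articles, two handcrafted features each
# (for example word count and reference count) and a 3-dimensional
# stand-in for an encoder output.
handcrafted = [[120.0, 3.0], [450.0, 10.0], [300.0, 5.0]]
autoextracted = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.2, 0.2, 0.2]]

fused = fuse(handcrafted, autoextracted)  # each row now has 2 + 3 = 5 values

# Random, untrained weights: 4 hidden units and 2 flaw classes (both assumed).
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(4)]
w_out = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]

probs = [feedforward(x, w_hidden, w_out) for x in fused]
```

Standardizing only the handcrafted features puts raw counts on a scale comparable to the learned representations before concatenation, which is the reason the abstract gives that step its own mention. In a real system the weights would be trained jointly with the encoder or on top of frozen encoder outputs; the abstract does not say which.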

Journal: Expert Systems with Applications
Publication Date: Apr 22, 2021
Citations: 8

Similar Papers

Bidirectional encoders to state-of-the-art: a review of BERT and its transformative impact on natural language processing
Rajesh Gupta
Информатика. Экономика. Управление - Informatics. Economics. Management | VOL. 3
02 Mar 2024

Applying Transformer Models for Disease Named Entity Recognition
S.T Jarashanth ... R.D Nawarathna
23 Feb 2022

Cross2Self-attentive Bidirectional Recurrent Neural Network with BERT for Biomedical Semantic Text Similarity
Zhengguang Li ... Zhihao Yang
16 Dec 2020

T-BERT: A Taiwanese Language Model: Pretraining a BERT Model on Taiwan's Local Languages
01 Jan 2020
