Measuring the quality of web content using factual information

Elisabeth Lex, Benno Stein, Christopher Horn, Leticia Cagnina, Michael Granitzer, Marcelo Errecalde, Michael Voelske, Edgardo Ferretti

doi:10.1145/2184305.2184308

Abstract

Nowadays, many decisions are based on information found on the Web. For the most part, the disseminating sources are not certified, and hence assessing the quality and credibility of Web content has become more important than ever. With factual density we present a simple statistical quality measure that is based on facts extracted from Web content using Open Information Extraction. In a first case study, we use this measure to identify featured/good articles in Wikipedia. We compare the factual density measure with word count, a measure that has successfully been applied to this task in the past. Our evaluation corroborates the good performance of word count in Wikipedia, since featured/good articles are often longer than non-featured ones. However, for articles of similar length the word count measure fails, while factual density can separate them with an F-measure of 90.4%. We also investigate the use of relational features for categorizing Wikipedia articles into featured/good versus non-featured ones. If articles have similar lengths, we achieve an F-measure of 86.7%, and 84% otherwise.
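
As a rough illustration of the measure, the sketch below assumes that factual density is the number of extracted facts divided by the article's word count. The abstract does not state the formula, so this normalization is an assumption, and the `Fact` type and `factual_density` function are hypothetical names introduced here. The Open Information Extraction step is not implemented; the facts are supplied by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A relational tuple of the kind Open IE systems emit (hypothetical type)."""
    arg1: str
    relation: str
    arg2: str


def factual_density(facts: list[Fact], text: str) -> float:
    """Facts per word. Normalizing by word count is an assumption of this sketch."""
    n_words = len(text.split())
    if n_words == 0:
        return 0.0  # avoid division by zero for empty articles
    return len(facts) / n_words


# Two toy "articles" of equal length: word count cannot tell them apart,
# but the number of extracted facts differs.
dense_text = "Graz is in Austria and Weimar is in Germany"
sparse_text = "This is a very nice and rather interesting place"

dense_facts = [
    Fact("Graz", "is in", "Austria"),
    Fact("Weimar", "is in", "Germany"),
]
sparse_facts: list[Fact] = []  # assume the extractor found nothing

dense_score = factual_density(dense_facts, dense_text)
sparse_score = factual_density(sparse_facts, sparse_text)
```

Both toy texts have nine words, so a word count baseline scores them identically, while the density score is higher for the text with more extractable facts. This mirrors the similar-length setting described in the abstract, not the authors' actual pipeline.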

Similar Papers

Measuring Web Content Credibility Using Predictive Models
R Manjula ... M S Vijaya
01 Jan 2020

Understanding and predicting Web content credibility using the Content Credibility Corpus
Michal Kakol ... Adam Wierzbicki
Information Processing & Management | VOL. 53
03 May 2017

Deep Neural Network for Evaluating Web Content Credibility Using Keras Sequential Model
R Manjula ... M S Vijaya
01 Jan 2020

Modelling sequences using pairwise relational features
Rudy Sicard ... Thierry Artières
Pattern Recognition | VOL. 42
06 Dec 2008
