Abstract

In this era of information and communication technology, a large population relies on the Internet to gather information. One of the most popular information sources on the Internet is Wikipedia. Wikipedia is a free encyclopedia that provides a wide range of information to its users. However, there have been concerns about the readability of information on Wikipedia time and again. The readability of the text is defined as the ease of understanding the underlying text. Past studies have analyzed the readability of Wikipedia articles with the help of conventional readability metrics, such as the Flesch-Kincaid readability score and the Automatic Readability Index (ARI). Such metrics only consider the surface-level parameters, such as the number of words, sentences, and paragraphs in the text, to quantify the readability. However, the readability of the text must also take into account the quality of the text. In this study, we consider many new NLP-based parameters capturing the quality of the text, such as lexical diversity, semantic diversity, lexical complexity, and semantic complexity and analyze their impact on the readability of Wikipedia articles using artificial neural networks. Besides NLP parameters, the crowdsourced parameters also affect the readability, and therefore, we also analyze the impact of crowdsourced parameters and observe that the crowdsourced parameters not only influence the readability scores but also affect the NLP parameters of the text. Additionally, we investigate the mediating effect of NLP parameters that connect the crowdsourced parameters to the readability of the text. The results show that the impact of crowdsourced parameters on readability is partially due to the profound effect of NLP-based parameters.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.