Improving Wikipedia verifiability with AI

Fabio Petroni, Samuel Broscheit, Aleksandra Piktus, Patrick Lewis, Gautier Izacard, Lucas Hosseini, Jane Dwivedi-Yu, Maria Lomeli, Timo Schick, Michele Bevilacqua, Pierre-Emmanuel Mazaré, Armand Joulin, Edouard Grave, Sebastian Riedel

doi:10.1038/s42256-023-00726-1

Abstract

Verifiability is a core content policy of Wikipedia: claims need to be backed by citations. Maintaining and improving the quality of Wikipedia references is an important challenge, and there is a pressing need for better tools to assist humans in this effort. We show that the process of improving references can be tackled with the help of artificial intelligence (AI) powered by an information retrieval system and a language model. This neural-network-based system, which we call SIDE, can identify Wikipedia citations that are unlikely to support their claims and subsequently recommend better ones from the web. We train this model on existing Wikipedia references, thereby learning from the contributions and combined wisdom of thousands of Wikipedia editors. Using crowdsourcing, we observe that for the 10% of citations that our system ranks as most likely to be unverifiable, humans prefer our system’s suggested alternatives over the originally cited reference 70% of the time. To validate the applicability of our system, we built a demo to engage with the English-speaking Wikipedia community and found that, for the same 10% of claims that SIDE ranks as most likely to be unverifiable, SIDE’s first citation recommendation is preferred twice as often as the existing Wikipedia citation. Our results indicate that an AI-based system could be used, in tandem with humans, to improve the verifiability of Wikipedia.

Full Text