Conference on Information and Knowledge Management 2023

This week our article “A Comparative Study of Reference Reliability in Multiple Language Editions of Wikipedia” was presented at the 32nd ACM International Conference on Information and Knowledge Management (CIKM 2023). The study extends a recent longitudinal assessment of reference quality on English Wikipedia by examining the reliability of references in over 300 language editions of Wikipedia to study cross-edition effects. It was a collaboration between Aitolkyn Baigutanova, Diego Saez-Trumper, Miriam Redi, Meeyoung Cha, and me. We hope to keep advancing research on the quality and verifiability of references, and ultimately to help editors improve knowledge integrity on Wikipedia.

Abstract:

Information presented in Wikipedia articles must be attributable to reliable published sources in the form of references. This study examines over 5 million Wikipedia articles to assess the reliability of references in multiple language editions. We quantify the cross-lingual patterns of the perennial sources list, a collection of reliability labels for web domains identified and collaboratively agreed upon by Wikipedia editors. We discover that some sources (or web domains) deemed untrustworthy in one language (i.e., English) continue to appear in articles in other languages. This trend is especially evident with sources tailored for smaller communities. Furthermore, non-authoritative sources found in the English version of a page tend to persist in other language versions of that page. We finally present a case study on the Chinese, Russian, and Swedish Wikipedias to demonstrate a discrepancy in reference reliability across cultures. Our finding highlights future challenges in coordinating global knowledge on source reliability.
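
To give a rough feel for the kind of measurement the abstract describes, here is a minimal Python sketch. It is not the code used in the paper. The domains, the label table, the edition names, and the helper functions `domain_of` and `unreliable_share` are all hypothetical and only for illustration; the label names follow the categories used in the English perennial sources list. The sketch maps each reference URL to its web domain, looks up a reliability label, and computes the share of labeled references that point to non-authoritative sources in each edition.

```python
from urllib.parse import urlparse

# Hypothetical reliability labels keyed by web domain (illustrative only,
# not taken from the actual perennial sources list).
PERENNIAL_LABELS = {
    "reliable-news.example": "generally reliable",
    "tabloid.example": "generally unreliable",
    "fringe.example": "deprecated",
}

# Labels treated as non-authoritative in this sketch (an assumption).
NON_AUTHORITATIVE = {"generally unreliable", "deprecated", "blacklisted"}


def domain_of(url):
    """Return the host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def unreliable_share(reference_urls):
    """Fraction of labeled references whose domain is non-authoritative.

    References to domains without a label are ignored; returns 0.0 when
    no reference has a label.
    """
    labels = [PERENNIAL_LABELS.get(domain_of(u)) for u in reference_urls]
    labeled = [label for label in labels if label is not None]
    if not labeled:
        return 0.0
    flagged = sum(1 for label in labeled if label in NON_AUTHORITATIVE)
    return flagged / len(labeled)


# Made-up references for two made-up language editions.
references_by_edition = {
    "edition_a": [
        "https://reliable-news.example/a",
        "https://www.reliable-news.example/b",
        "https://unlisted.example/c",
        "https://tabloid.example/d",
    ],
    "edition_b": [
        "https://tabloid.example/d",
        "https://fringe.example/e",
    ],
}

shares = {
    edition: unreliable_share(urls)
    for edition, urls in references_by_edition.items()
}
```

Comparing the values in `shares` across editions is the simplest version of the cross-edition comparison. A real analysis would also need to handle subdomains, archived links, and labels that differ between language communities.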

