Factual inconsistencies in multilingual Wikipedia tables

Cappa, Silvia; Kong, Lingxiao; Peet, Pille-Riin; Wei, Fanfu; Zhou, Yuchen; Kalo, Jan-Christoph
RTF Work at ISWS Summer School 2025, Knowledge graphs for reliable AI, 8-14 June 2025, Bertinoro, Italy / Also submitted to arXiv, 24 July 2025

Wikipedia serves as a globally accessible knowledge source with content in over 300 languages. Although they cover the same topics, the different language versions of Wikipedia are written and updated independently. This leads to factual inconsistencies that can undermine the neutrality and reliability of both the encyclopedia and the AI systems that often rely on Wikipedia as a main training source. This study investigates cross-lingual inconsistencies in Wikipedia’s structured content, with a focus on tabular data. We develop a methodology to collect, align, and analyze tables from multilingual Wikipedia articles, and we define categories of inconsistency. We apply various quantitative and qualitative metrics to assess multilingual alignment on a sample dataset. The resulting insights have implications for factual verification, multilingual knowledge interaction, and the design of reliable AI systems that leverage Wikipedia content.
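
The abstract does not spell out how tables are aligned or compared, so the following is only an illustrative sketch of what a cell-level comparison between two language versions of the same table might look like. It assumes each table has already been collected and reduced to a mapping from a shared row key to column values. The names `find_inconsistencies` and `unaligned_rows`, the exact-string comparison, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Assumed representation: row key -> {column name -> cell value as text}.
Table = Dict[str, Dict[str, str]]


def find_inconsistencies(a: Table, b: Table) -> List[Tuple[str, str, str, str]]:
    """Return (row, column, value_a, value_b) for aligned cells whose values differ."""
    diffs = []
    for row in sorted(a.keys() & b.keys()):            # rows present in both tables
        for col in sorted(a[row].keys() & b[row].keys()):  # columns present in both
            va, vb = a[row][col].strip(), b[row][col].strip()
            if va != vb:                               # naive exact-match comparison
                diffs.append((row, col, va, vb))
    return diffs


def unaligned_rows(a: Table, b: Table) -> List[str]:
    """Return row keys that appear in only one of the two tables."""
    return sorted(a.keys() ^ b.keys())


# Invented toy data standing in for two language versions of one table.
en_table: Table = {
    "Team A": {"founded": "1901", "titles": "12"},
    "Team B": {"founded": "1923", "titles": "4"},
}
it_table: Table = {
    "Team A": {"founded": "1901", "titles": "13"},
    "Team C": {"founded": "1950", "titles": "1"},
}

print(find_inconsistencies(en_table, it_table))  # [('Team A', 'titles', '12', '13')]
print(unaligned_rows(en_table, it_table))        # ['Team B', 'Team C']
```

A real pipeline would also need to translate or normalize headers and values (dates, number formats, units) before comparison; this sketch skips that step entirely.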


Type:
Poster / Demo
City:
Bertinoro
Date:
2025-06-08
Department:
Data Science
Eurecom Ref:
8699
Copyright:
© EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in RTF Work at ISWS Summer School 2025, Knowledge graphs for reliable AI, 8-14 June 2025, Bertinoro, Italy / Also submitted to arXiv, 24 July 2025 and is available at:

PERMALINK : https://www.eurecom.fr/publication/8699