TY - JOUR
T1 - Exploring cultural commonsense in multilingual large language models
T2 - A survey
AU - Binegde, Geleta Negasa
AU - Zhang, Huaping
N1 - Publisher Copyright:
© 2025 Elsevier Ltd.
PY - 2026/6
Y1 - 2026/6
N2 - Large language models (LLMs) have demonstrated impressive proficiency in multilingual natural language processing (NLP), yet they frequently struggle with cultural commonsense—the implicit knowledge shaped by societal norms, traditions, and shared experiences. As these models are deployed in diverse linguistic and cultural settings, their ability to understand and apply cultural commonsense becomes crucial for ensuring fairness, inclusivity, and contextual accuracy. This paper presents a systematic review and a large-scale empirical benchmark for evaluating cultural commonsense in multilingual LLMs. Through a comprehensive evaluation of 15 models on the BLEnD dataset, our analysis reveals a critical performance gap of 64.2% between high-resource and low-resource cultures. The results demonstrate significant disparities across model architectures: encoder-only models show more consistent but lower overall performance compared to decoder-based models. We identify key limitations, including data scarcity, representational bias, and inadequate cross-lingual knowledge transfer. Finally, we propose future research directions, such as culturally diverse dataset curation, hybrid knowledge graph architectures, and fairness-aware fine-tuning. The primary contributions of this work are (1) a systematic review of challenges and mitigation strategies for cultural commonsense; (2) a large-scale empirical benchmark that evaluates 15 multilingual LLMs across 13 languages and 16 countries, revealing significant performance disparities; and (3) concrete findings on the effects of model architecture and the limitations of scale in cultural understanding. This research underscores the urgent need to advance cultural commonsense in multilingual LLMs to ensure the development of fair, inclusive, and contextually accurate AI systems globally.
KW - Commonsense knowledge
KW - Cultural bias
KW - Cultural understanding
KW - Multilingual LLMs
UR - https://www.scopus.com/pages/publications/105023690827
U2 - 10.1016/j.is.2025.102649
DO - 10.1016/j.is.2025.102649
M3 - Article
AN - SCOPUS:105023690827
SN - 0306-4379
VL - 138
JO - Information Systems
JF - Information Systems
M1 - 102649
ER -