An Empirical Study on the Language Modal in Visual Question Answering

Daowan Peng, Wei Wei*, Xian Ling Mao, Yuanyuan Fu, Dangyang Chen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 Citation (Scopus)

Abstract

Generalization beyond in-domain experience to out-of-distribution data is of paramount significance in the AI domain. Of late, state-of-the-art Visual Question Answering (VQA) models have shown impressive performance on in-domain data, partially due to language prior bias, which nevertheless hinders their generalization ability in practice. This paper attempts to provide new insights into the influence of the language modality on VQA performance from an empirical perspective. To achieve this, we conducted a series of experiments on six models. The results revealed that 1) apart from the prior bias caused by question types, postfix-related bias also contributes notably to model bias, and 2) training VQA models with word-sequence-related variant questions improved performance on the out-of-distribution benchmark, with LXMERT even achieving a 10-point gain without adopting any debiasing methods. We delved into the underlying reasons behind these experimental results and put forward some simple proposals to reduce the models' dependency on language priors. The experimental results demonstrated the effectiveness of our proposed method in improving performance on the out-of-distribution benchmark, VQA-CPv2. We hope this study can inspire novel insights for future research on designing bias-reduction approaches.

Original language: English
Title of host publication: Proceedings of the 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023
Editor: Edith Elkind
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 4109-4117
Number of pages: 9
ISBN (electronic): 9781956792034
Publication status: Published - 2023
Event: 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023 - Macao, China
Duration: 19 Aug 2023 to 25 Aug 2023

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2023-August
ISSN (print): 1045-0823

Conference

Conference: 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023
Country/Territory: China
City: Macao
Period: 19/08/23 to 25/08/23
