TY - JOUR
T1 - Evaluation of graph convolutional networks performance for visual question answering on reasoning datasets
AU - Yusuf, Abdulganiyu Abdu
AU - Chong, Feng
AU - Xianling, Mao
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2022/11
Y1 - 2022/11
N2 - In recent years, graph neural networks have been widely used on vision-to-language tasks and have achieved promising results. In particular, the graph convolutional network (GCN) is capable of capturing the spatial and semantic relationships needed for visual question answering (VQA). However, applying GCN to VQA datasets with different subtasks can lead to varying results. The training and testing set sizes, the evaluation metrics, and the hyperparameters used are other factors that affect VQA results. These factors can be subjected to similar evaluation schemes in order to obtain fair evaluations of GCN-based results for VQA. This study proposes a GCN framework for VQA based on fine-tuned word representations to handle reasoning-type questions. The framework's performance is evaluated using various performance measures. The results obtained on the GQA and VQA 2.0 datasets slightly outperform most existing methods.
AB - In recent years, graph neural networks have been widely used on vision-to-language tasks and have achieved promising results. In particular, the graph convolutional network (GCN) is capable of capturing the spatial and semantic relationships needed for visual question answering (VQA). However, applying GCN to VQA datasets with different subtasks can lead to varying results. The training and testing set sizes, the evaluation metrics, and the hyperparameters used are other factors that affect VQA results. These factors can be subjected to similar evaluation schemes in order to obtain fair evaluations of GCN-based results for VQA. This study proposes a GCN framework for VQA based on fine-tuned word representations to handle reasoning-type questions. The framework's performance is evaluated using various performance measures. The results obtained on the GQA and VQA 2.0 datasets slightly outperform most existing methods.
KW - Fine-tuned representation
KW - GCN
KW - Performance measure
KW - Reasoning datasets
KW - VQA
UR - https://www.scopus.com/pages/publications/85129480969
U2 - 10.1007/s11042-022-13065-x
DO - 10.1007/s11042-022-13065-x
M3 - Article
AN - SCOPUS:85129480969
SN - 1380-7501
VL - 81
SP - 40361
EP - 40370
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 28
ER -