TY - GEN
T1 - An In-depth Interactive and Visualized Platform for Evaluating and Analyzing MRC Models
AU - Wu, Zhijing
AU - Fang, Jingliang
AU - Xu, Hua
AU - Gao, Kai
N1 - Publisher Copyright: © 2022 Owner/Author.
PY - 2022/10/17
Y1 - 2022/10/17
N2 - Machine Reading Comprehension (MRC) has advanced by leaps and bounds on question answering. However, because existing accuracy-based evaluation metrics are agnostic to the nuances of neural networks, the true understanding and inference abilities of MRC models remain largely unknown. To address these limitations, InDepth-Eva-MRC, an interactive and visualized platform, is proposed to provide fine-grained cognitive analysis of MRC models. Concretely, the platform applies post-hoc methods to explain the behavior of MRC models. On the one hand, it analyzes linguistic bias by comparing performance on samples with different linguistic properties. On the other hand, it performs skill-based analysis using modified test samples and semi-automatically generated test samples. Furthermore, through detailed and interactive visualizations, the platform offers in-depth result analysis and model comparison at a fine-grained cognitive level. A screencast video and additional material are available at https://github.com/thuiar/InDepth-Eva-MRC.
AB - Machine Reading Comprehension (MRC) has advanced by leaps and bounds on question answering. However, because existing accuracy-based evaluation metrics are agnostic to the nuances of neural networks, the true understanding and inference abilities of MRC models remain largely unknown. To address these limitations, InDepth-Eva-MRC, an interactive and visualized platform, is proposed to provide fine-grained cognitive analysis of MRC models. Concretely, the platform applies post-hoc methods to explain the behavior of MRC models. On the one hand, it analyzes linguistic bias by comparing performance on samples with different linguistic properties. On the other hand, it performs skill-based analysis using modified test samples and semi-automatically generated test samples. Furthermore, through detailed and interactive visualizations, the platform offers in-depth result analysis and model comparison at a fine-grained cognitive level. A screencast video and additional material are available at https://github.com/thuiar/InDepth-Eva-MRC.
KW - evaluation
KW - in-depth analysis
KW - machine reading comprehension
UR - http://www.scopus.com/inward/record.url?scp=85140826166&partnerID=8YFLogxK
U2 - 10.1145/3511808.3557167
DO - 10.1145/3511808.3557167
M3 - Conference contribution
AN - SCOPUS:85140826166
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 5044
EP - 5048
BT - CIKM 2022 - Proceedings of the 31st ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 31st ACM International Conference on Information and Knowledge Management, CIKM 2022
Y2 - 17 October 2022 through 21 October 2022
ER -