Query Prior Matters: A MRC Framework for Multimodal Named Entity Recognition

Meihuizi Jia, Xin Shen, Lei Shen, Jinhui Pang*, Lejian Liao, Yang Song, Meng Chen, Xiaodong He

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

36 Citations (Scopus)

Abstract

Multimodal named entity recognition (MNER) is a vision-language task where the system is required to detect entity spans and corresponding entity types given a sentence-image pair. Existing methods capture text-image relations with various attention mechanisms that only obtain implicit alignments between entity types and image regions. To locate regions more accurately and better model cross-/within-modal relations, we propose a machine reading comprehension based framework for MNER, namely MRC-MNER. By utilizing queries in MRC, our framework can provide prior information about entity types and image regions. Specifically, we design two stages, Query-Guided Visual Grounding and Multi-Level Modal Interaction, to align fine-grained type-region information and simulate text-image/inner-text interactions respectively. For the former, we train a visual grounding model via transfer learning to extract region candidates that can be further integrated into the second stage to enhance token representations. For the latter, we design text-image and inner-text interaction modules along with three sub-tasks for MRC-MNER. To verify the effectiveness of our model, we conduct extensive experiments on two public MNER datasets, Twitter2015 and Twitter2017. Experimental results show that MRC-MNER outperforms the current state-of-the-art models on Twitter2017, and yields competitive results on Twitter2015.

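Original language: English
Title of host publication: MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 3549-3558
Number of pages: 10
ISBN (Electronic): 9781450392037
DOI: https://doi.org/10.1145/3503161.3548427
Publication status: Published - 10 Oct 2022
Externally published
Event: 30th ACM International Conference on Multimedia, MM 2022 - Lisboa, Portugal
Duration: 10 Oct 2022 to 14 Oct 2022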

Publication series

Name: MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia

Conference

Conference: 30th ACM International Conference on Multimedia, MM 2022
Country/Territory: Portugal
City: Lisboa
Period: 10/10/22 to 14/10/22

Cite this

Jia, M., Shen, X., Shen, L., Pang, J., Liao, L., Song, Y., Chen, M., & He, X. (2022). Query Prior Matters: A MRC Framework for Multimodal Named Entity Recognition. In MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia (pp. 3549-3558). (MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia). Association for Computing Machinery, Inc. https://doi.org/10.1145/3503161.3548427