TED-EL: A Corpus for Speech Entity Linking

Silin Li, Ruoyu Song, Tianwei Lan, Zeming Liu, Yuhang Guo*

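*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract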

Speech entity linking aims to recognize mentions in speech and link them to entities in knowledge bases. Previous work on entity linking mainly focuses on visual and textual context; speech entity linking, in contrast, focuses on audio context. In this paper, we first propose the speech entity linking task. To facilitate the study of this task, we propose the first speech entity linking dataset, TED-EL. Our corpus is a high-quality, human-annotated parallel dataset of audio, text, and mention-entity pairs derived from Technology, Entertainment, Design (TED) talks, and it covers a wide range of entity types (24 types). Based on TED-EL, we designed two types of speech entity linking models: ranking-based and generative. We conducted experiments with both types of models on the TED-EL dataset. The results show that our ranking-based models outperform the generative models, achieving an F1 score of 60.68%.
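
As a rough illustration of how an F1 score over mention-entity pairs can be computed, the sketch below scores predicted pairs against gold pairs by exact match. This is only an assumption about the scoring scheme: the abstract does not specify how the reported F1 is defined, and the function name `micro_f1`, the tuple layout, and the utterance and entity identifiers are hypothetical placeholders rather than anything taken from TED-EL.

```python
from typing import Iterable, Tuple

# Hypothetical layout: (utterance_id, mention_text, entity_id).
Pair = Tuple[str, str, str]


def micro_f1(gold: Iterable[Pair], pred: Iterable[Pair]) -> float:
    """Exact-match micro F1 between gold and predicted mention-entity pairs."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Placeholder data, not drawn from the corpus.
gold_pairs = [
    ("utt1", "Paris", "E1"),
    ("utt1", "Einstein", "E2"),
    ("utt2", "TED", "E3"),
]
pred_pairs = [
    ("utt1", "Paris", "E1"),      # correct mention and entity
    ("utt1", "Einstein", "E2"),   # correct mention and entity
    ("utt2", "Apple", "E4"),      # spurious prediction
]

score = micro_f1(gold_pairs, pred_pairs)
print(f"F1 = {score:.4f}")
```

Under this exact-match assumption, a prediction counts as correct only when both the mention and the linked entity agree with the gold annotation, so a correctly recognized mention linked to the wrong entity lowers both precision and recall.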

Original language: English
Title of host publication: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Publisher: European Language Resources Association (ELRA)
Pages: 15721-15731
Number of pages: 11
ISBN (electronic): 9782493814104
Publication status: Published - 2024
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 to 25 May 2024

Publication series

Name: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

Conference

Conference: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Country/Territory: Italy
City: Hybrid, Torino
Period: 20/05/24 to 25/05/24
