Revisiting correlations between intrinsic and extrinsic evaluations of word embeddings

Yuanyuan Qiu, Hongzheng Li, Shen Li, Yingdi Jiang, Renfen Hu*, Lijiao Yang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

55 Citations (Scopus)

Abstract

The evaluation of word embeddings has received considerable attention in recent years, but there is ongoing debate about whether intrinsic measures can predict performance on downstream tasks. To investigate this question, this paper presents the first study of the correlation between intrinsic and extrinsic evaluation results for Chinese word embeddings. We use word similarity and word analogy as the intrinsic tasks, and Named Entity Recognition and Sentiment Classification as the extrinsic tasks. A variety of Chinese word embeddings trained with different corpora and context features are used in the experiments. From the data analysis, we reach two conclusions: there are strong correlations between intrinsic and extrinsic evaluations, and performance on different tasks can be affected by training corpora and context features to varying degrees.

Original language: English
Title of host publication: Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data - 17th China National Conference, CCL 2018, and 6th International Symposium, NLP-NABD 2018, Proceedings
Editors: Xiaojie Wang, Ting Liu, Maosong Sun, Zhiyuan Liu, Yang Liu
Publisher: Springer Verlag
Pages: 209-221
Number of pages: 13
ISBN (Print): 9783030017156
Publication status: Published - 2018
Event: 17th China National Conference on Computational Linguistics, CCL 2018 and 6th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data, NLP-NABD 2018 - Changsha, China
Duration: 19 Oct 2018 to 21 Oct 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11221 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th China National Conference on Computational Linguistics, CCL 2018 and 6th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data, NLP-NABD 2018
Country/Territory: China
City: Changsha
Period: 19/10/18 to 21/10/18
