Revisiting correlations between intrinsic and extrinsic evaluations of word embeddings

Yuanyuan Qiu, Hongzheng Li, Shen Li, Yingdi Jiang, Renfen Hu*, Lijiao Yang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

55 Citations (Scopus)

Abstract

The evaluation of word embeddings has received considerable attention in recent years, but there has been some debate about whether intrinsic measures can predict performance on downstream tasks. To investigate this question, this paper presents the first study on the correlation between the results of intrinsic and extrinsic evaluation with Chinese word embeddings. We use word similarity and word analogy as the intrinsic tasks, and Named Entity Recognition and Sentiment Classification as the extrinsic tasks. A variety of Chinese word embeddings trained with different corpora and context features are used in the experiments. From the data analysis, we reach some interesting conclusions: there are strong correlations between intrinsic and extrinsic evaluations, and the performance of different tasks can be affected by training corpora and context features to varying degrees.
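As a rough illustration of the kind of analysis the abstract describes, the sketch below correlates one intrinsic score with one extrinsic score across several embeddings. It assumes Pearson correlation, which the abstract does not specify. The function name `pearson`, the variable names, and all numbers are hypothetical placeholders, not values from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Placeholder scores, one entry per hypothetical embedding. NOT from the paper.
word_similarity = [0.41, 0.47, 0.52, 0.58, 0.60]  # intrinsic task score
ner_f1 = [0.80, 0.82, 0.83, 0.86, 0.87]           # extrinsic task score

r = pearson(word_similarity, ner_f1)
print(f"Pearson r = {r:.3f}")
```

A value of r near 1 would suggest that embeddings ranking well on the intrinsic task also tend to rank well on the extrinsic one. A rank-based coefficient such as Spearman's could be substituted if only the ordering of embeddings matters.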

Original language: English
Title of host publication: Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data - 17th China National Conference, CCL 2018, and 6th International Symposium, NLP-NABD 2018, Proceedings
Editors: Xiaojie Wang, Ting Liu, Maosong Sun, Zhiyuan Liu, Yang Liu
Publisher: Springer Verlag
Pages: 209-221
Number of pages: 13
ISBN (Print): 9783030017156
Publication status: Published - 2018
Event: 17th China National Conference on Computational Linguistics, CCL 2018 and 6th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data, NLP-NABD 2018 - Changsha, China
Duration: 19 Oct 2018 to 21 Oct 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11221 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th China National Conference on Computational Linguistics, CCL 2018 and 6th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data, NLP-NABD 2018
Country/Territory: China
City: Changsha
Period: 19/10/18 to 21/10/18

Keywords

  • Extrinsic evaluation
  • Intrinsic evaluation
  • Word embedding
