Fast and Accurate Bilingual Lexicon Induction via Matching Optimization

Zewen Chi, Heyan Huang*, Shenjian Zhao, Heng Da Xu, Xian Ling Mao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

Most recent state-of-the-art approaches to bilingual lexicon induction rely on pre-trained word embeddings. However, word embeddings introduce noise for both frequent and rare words, and especially for rare words, whose embeddings are often poorly learned because they occur so rarely in the training data. To alleviate this problem, we propose BLIMO, a simple yet effective approach to automatic lexicon induction. It does not use word embeddings; instead, it converts lexicon induction into a maximum weighted matching problem, which can be solved efficiently by matching optimization with greedy search. Experiments show that the proposed method substantially outperforms state-of-the-art baselines on two standard benchmarks.
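To make the matching formulation concrete, the following is a minimal sketch of greedy search for a maximum weighted bipartite matching. It is not the BLIMO algorithm itself, since the abstract does not specify how edge weights are computed or how the optimization is organized. The function name `greedy_matching`, the `toy_weights` scores, and the word pairs are all hypothetical and chosen only for illustration. The sketch assumes each source word is paired with at most one target word.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (source word, target word)


def greedy_matching(weights: Dict[Edge, float]) -> List[Tuple[str, str, float]]:
    """Greedy approximation to maximum weighted bipartite matching.

    Visits candidate pairs from heaviest to lightest and keeps a pair only
    if neither of its words has been matched yet.
    """
    matched_src = set()
    matched_tgt = set()
    lexicon = []
    # Sort by descending weight; ties are broken by the pair itself so the
    # result is deterministic.
    for (src, tgt), w in sorted(weights.items(), key=lambda kv: (-kv[1], kv[0])):
        if src in matched_src or tgt in matched_tgt:
            continue  # one endpoint is already used by a heavier pair
        matched_src.add(src)
        matched_tgt.add(tgt)
        lexicon.append((src, tgt, w))
    return lexicon


# Hypothetical association scores between English and French words
# (assumed values, not taken from the paper).
toy_weights = {
    ("dog", "chien"): 0.9,
    ("dog", "chat"): 0.2,
    ("cat", "chat"): 0.8,
    ("cat", "chien"): 0.3,
    ("house", "maison"): 0.7,
    ("house", "chat"): 0.75,
}

lexicon = greedy_matching(toy_weights)
for src, tgt, w in lexicon:
    print(src, tgt, w)
```

In this toy input, the pair ("house", "chat") has a higher score than ("house", "maison"), but "chat" is already claimed by the heavier pair ("cat", "chat"), so the greedy pass assigns "house" to "maison". Sorting dominates the cost, so the pass runs in O(E log E) time for E candidate pairs, which is one plausible reason a greedy strategy could be fast; an exact solver would trade speed for optimality.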

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 8th CCF International Conference, NLPCC 2019, Proceedings
Editors: Jie Tang, Min-Yen Kan, Dongyan Zhao, Sujian Li, Hongying Zan
Publisher: Springer
Pages: 737-748
Number of pages: 12
ISBN (Print): 9783030322328
DOI
Publication status: Published - 2019
Event: 8th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2019 - Dunhuang, China
Duration: 9 Oct 2019 to 14 Oct 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11838 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2019
Country/Territory: China
City: Dunhuang
Period: 9/10/19 to 14/10/19
