TY - GEN
T1 - Fast and Accurate Bilingual Lexicon Induction via Matching Optimization
AU - Chi, Zewen
AU - Huang, Heyan
AU - Zhao, Shenjian
AU - Xu, Heng Da
AU - Mao, Xian Ling
N1 - Publisher Copyright:
© 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
N2 - Most recent state-of-the-art approaches to bilingual lexicon induction rely on pre-trained word embeddings. However, word embeddings introduce noise for both frequent and rare words, especially for rare words, whose embeddings are not well learned because of their low occurrence in the training data. To alleviate this problem, we propose BLIMO, a simple yet effective approach for automatic lexicon induction. It does not use word embeddings; instead, it converts the lexicon induction problem into a maximum weighted matching problem, which can be solved efficiently by matching optimization with greedy search. Empirical experiments further demonstrate that the proposed method substantially outperforms state-of-the-art baselines on two standard benchmarks.
AB - Most recent state-of-the-art approaches to bilingual lexicon induction rely on pre-trained word embeddings. However, word embeddings introduce noise for both frequent and rare words, especially for rare words, whose embeddings are not well learned because of their low occurrence in the training data. To alleviate this problem, we propose BLIMO, a simple yet effective approach for automatic lexicon induction. It does not use word embeddings; instead, it converts the lexicon induction problem into a maximum weighted matching problem, which can be solved efficiently by matching optimization with greedy search. Empirical experiments further demonstrate that the proposed method substantially outperforms state-of-the-art baselines on two standard benchmarks.
UR - http://www.scopus.com/inward/record.url?scp=85075558804&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-32233-5_57
DO - 10.1007/978-3-030-32233-5_57
M3 - Conference contribution
AN - SCOPUS:85075558804
SN - 9783030322328
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 737
EP - 748
BT - Natural Language Processing and Chinese Computing - 8th CCF International Conference, NLPCC 2019, Proceedings
A2 - Tang, Jie
A2 - Kan, Min-Yen
A2 - Zhao, Dongyan
A2 - Li, Sujian
A2 - Zan, Hongying
PB - Springer
T2 - 8th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2019
Y2 - 9 October 2019 through 14 October 2019
ER -