An Enhanced New Word Identification Approach Using Bilingual Alignment

Ziyan Yang, Huaping Zhang*, Jianyun Shang, Silamu Wushour

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding; conference contribution; peer-reviewed

Abstract

Traditional new word detection has focused on finding the positional distribution of new words in Chinese text, and has rarely addressed other languages. It has also been difficult to obtain semantic information or translations for these new words. This paper proposes NEWBA, an enhanced new word identification algorithm that uses bilingual corpus alignment. The paper reports that NEWBA performs better than the traditional unsupervised method. In addition, NEWBA can extract bilingual word pairs, which provide translations as well as detection. NEWBA can thus expand the scope of traditional new word detection and obtain more valuable information from bilingual aligned corpora.

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Proceedings
Editors: Wei Lu, Shujian Huang, Yu Hong, Xiabing Zhou
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 92-104
Number of pages: 13
ISBN (Print): 9783031171192
DOI
Publication status: Published - 2022
Event: 11th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2022 - Guilin, China
Duration: 24 Sep 2022 to 25 Sep 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13551 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2022
Country/Territory: China
City: Guilin
Period: 24/09/22 to 25/09/22
