Continual Domain Adaption for Neural Machine Translation

Manzhi Yang, Huaping Zhang, Chenxi Yu, Guotong Geng*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Domain Neural Machine Translation (NMT) with small datasets requires continual learning to incorporate new knowledge, and catastrophic forgetting, in which the model forgets old knowledge during fine-tuning, is the main challenge. Additionally, most studies ignore the multi-stage domain adaptation of NMT. To address these issues, we propose a multi-stage incremental framework for domain NMT based on knowledge distillation. We also analyze how the supervised signals of the golden label and the teacher model work within a stage. Results show that the teacher model benefits the student model only in the early epochs and harms it in the later epochs. To solve this problem, we propose using two training objectives, one for the early and one for the later epochs. For the early epochs, conventional continual learning is retained to fully leverage the teacher model and integrate old knowledge. For the later epochs, a bidirectional marginal loss is used to remove the negative impact of the teacher model. Experiments show that our method outperforms multiple continual learning methods, with average improvements of 1.11 and 1.06 on two domain translation tasks.
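A minimal sketch of the two-objective schedule described in the abstract, assuming a PyTorch-style setup. The exact form of the bidirectional marginal loss, the `switch_epoch` hyper-parameter, and how the later-epoch terms are combined are illustrative assumptions, not the paper's actual implementation:

```python
# Illustrative sketch only: two-objective training schedule suggested by the
# abstract (function names, margin form, and switch point are assumptions).
import torch
import torch.nn.functional as F


def kd_loss(student_logits, teacher_logits, gold_ids, pad_id, alpha=0.5, temp=2.0):
    """Early-epoch objective: cross-entropy on the golden labels plus
    KL distillation from the frozen teacher (pad masking omitted for brevity)."""
    vocab = student_logits.size(-1)
    ce = F.cross_entropy(
        student_logits.view(-1, vocab), gold_ids.view(-1), ignore_index=pad_id
    )
    kl = F.kl_div(
        F.log_softmax(student_logits.view(-1, vocab) / temp, dim=-1),
        F.softmax(teacher_logits.view(-1, vocab) / temp, dim=-1),
        reduction="batchmean",
    ) * (temp ** 2)
    return (1.0 - alpha) * ce + alpha * kl


def bidirectional_margin_loss(student_logits, teacher_logits, gold_ids, pad_id, margin=1.0):
    """Later-epoch term (assumed form of the bidirectional marginal loss): only
    penalise the student when its log-probability of the gold token drifts more
    than `margin` away from the teacher's, in either direction."""
    mask = gold_ids.ne(pad_id).float()
    s_gold = F.log_softmax(student_logits, dim=-1).gather(-1, gold_ids.unsqueeze(-1)).squeeze(-1)
    t_gold = F.log_softmax(teacher_logits, dim=-1).gather(-1, gold_ids.unsqueeze(-1)).squeeze(-1)
    gap = (s_gold - t_gold).abs()
    return (torch.clamp(gap - margin, min=0.0) * mask).sum() / mask.sum()


def stage_loss(epoch, switch_epoch, student_logits, teacher_logits, gold_ids, pad_id):
    """Early epochs: conventional distillation-based continual learning.
    Later epochs: cross-entropy on the gold labels plus the margin term, so the
    teacher only loosely constrains the student (this composition is an assumption)."""
    if epoch < switch_epoch:
        return kd_loss(student_logits, teacher_logits, gold_ids, pad_id)
    vocab = student_logits.size(-1)
    ce = F.cross_entropy(student_logits.view(-1, vocab), gold_ids.view(-1), ignore_index=pad_id)
    return ce + bidirectional_margin_loss(student_logits, teacher_logits, gold_ids, pad_id)
```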

Original language: English
Title of host publication: Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings
Editors: Biao Luo, Long Cheng, Zheng-Guang Wu, Hongyi Li, Chaojie Li
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 427-439
Number of pages: 13
ISBN (Print): 9789819981441
DOI
Publication status: Published - 2024
Event: 30th International Conference on Neural Information Processing, ICONIP 2023 - Changsha, China
Duration: 20 Nov 2023 → 23 Nov 2023

Publication series

Name: Communications in Computer and Information Science
Volume: 1965 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 30th International Conference on Neural Information Processing, ICONIP 2023
Country/Territory: China
City: Changsha
Period: 20/11/23 → 23/11/23
