TY - JOUR
T1 - Learning Domain Specific Sub-layer Latent Variable for Multi-Domain Adaptation Neural Machine Translation
AU - Huang, Shuanghong
AU - Feng, Chong
AU - Shi, Ge
AU - Li, Zhengjun
AU - Zhao, Xuan
AU - Li, Xinyan
AU - Wang, Xiaomei
N1 - Publisher Copyright:
Copyright © 2024 held by the owner/author(s). Publication rights licensed to ACM.
PY - 2024/6/21
Y1 - 2024/6/21
N2 - Domain adaptation is an effective solution for addressing inadequate translation performance within specific domains. However, the straightforward approach of mixing data from multiple domains to obtain a multi-domain neural machine translation (NMT) model can give rise to parameter interference between domains, degrading overall performance. To address this, we introduce a multi-domain adaptive NMT method that learns domain-specific sub-layer latent variables, employing the Gumbel-Softmax reparameterization technique to train the model parameters and the domain-specific sub-layer latent variables concurrently. This approach learns private domain-specific knowledge while sharing common domain-invariant knowledge, effectively mitigating the parameter interference problem. Experimental results show that our method improves over the baseline model by up to 7.68 and 3.71 BLEU on public English-German and Chinese-English multi-domain datasets, respectively.
AB - Domain adaptation is an effective solution for addressing inadequate translation performance within specific domains. However, the straightforward approach of mixing data from multiple domains to obtain a multi-domain neural machine translation (NMT) model can give rise to parameter interference between domains, degrading overall performance. To address this, we introduce a multi-domain adaptive NMT method that learns domain-specific sub-layer latent variables, employing the Gumbel-Softmax reparameterization technique to train the model parameters and the domain-specific sub-layer latent variables concurrently. This approach learns private domain-specific knowledge while sharing common domain-invariant knowledge, effectively mitigating the parameter interference problem. Experimental results show that our method improves over the baseline model by up to 7.68 and 3.71 BLEU on public English-German and Chinese-English multi-domain datasets, respectively.
KW - multi-domain adaptation
KW - neural machine translation
KW - parameter interference
KW - sub-layer latent variable
UR - http://www.scopus.com/inward/record.url?scp=85197368626&partnerID=8YFLogxK
U2 - 10.1145/3661305
DO - 10.1145/3661305
M3 - Article
AN - SCOPUS:85197368626
SN - 2375-4699
VL - 23
JO - ACM Transactions on Asian and Low-Resource Language Information Processing
JF - ACM Transactions on Asian and Low-Resource Language Information Processing
IS - 6
M1 - 78
ER -