Enhancing Zero-Shot Translation in Multilingual Neural Machine Translation: Focusing on Obtaining Location-Agnostic Representations

Jiarui Zhang*, Heyan Huang*, Yue Hu, Ping Guo

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

In the field of multilingual neural machine translation, a notable challenge is zero-shot translation, where a model translates languages that it has not been trained on. This often results in poor translation quality, mainly because the model’s internal language representations are too specific to its training languages. We illustrate that the positional relationship to input tokens is a primary factor contributing to the language-specific representations. We find a solution by modifying the model’s structure, specifically by removing certain connections in its encoder layer. This simple change significantly improves the quality of zero-shot translations, with an increase of up to 11.1 BLEU points, a measure of translation accuracy. Importantly, this improvement does not affect the quality of translations for the languages the model was trained on. Besides, our method facilitates the seamless incorporation of new languages, significantly broadening the scope of translation coverage.

Original language: English
Title of host publication: Artificial Neural Networks and Machine Learning – ICANN 2024 - 33rd International Conference on Artificial Neural Networks, Proceedings
Editors: Michael Wand, Kristína Malinovská, Jürgen Schmidhuber, Igor V. Tetko
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 194-208
Number of pages: 15
ISBN (Print): 9783031723490
Publication status: Published - 2024
Event: 33rd International Conference on Artificial Neural Networks, ICANN 2024 - Lugano, Switzerland
Duration: 17 Sept 2024 to 20 Sept 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15022 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 33rd International Conference on Artificial Neural Networks, ICANN 2024
Country/Territory: Switzerland
City: Lugano
Period: 17/09/24 to 20/09/24

Keywords

  • location-agnostic representations
  • multilingual neural machine translation
  • removing certain connections
  • zero-shot translation
