Can Cross-Lingual Transferability of Multilingual Transformers Be Activated Without End-Task Data?

Zewen Chi, Heyan Huang*, Xian-Ling Mao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Pretrained multilingual Transformers have achieved great success in cross-lingual transfer learning. Current methods typically activate the cross-lingual transferability of multilingual Transformers by fine-tuning them on end-task data. However, these methods cannot perform cross-lingual transfer when end-task data are unavailable. In this work, we explore whether cross-lingual transferability can be activated without end-task data. We propose a cross-lingual transfer method, named PLUGIN-X. PLUGIN-X disassembles monolingual and multilingual Transformers into sub-modules, and reassembles them into a multilingual end-task model. After representation adaptation, PLUGIN-X finally performs cross-lingual transfer in a plug-and-play style. Experimental results show that PLUGIN-X successfully activates the cross-lingual transferability of multilingual Transformers without accessing end-task data. Moreover, we analyze how cross-model representation alignment affects cross-lingual transferability.
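The abstract's "disassemble and reassemble" idea can be illustrated with a minimal sketch. This is not the paper's code: all class and function names below (`SubModule`, `disassemble`, `reassemble`, the layer counts, and the split point) are hypothetical, chosen only to show the plug-and-play structure — a task-tuned monolingual model's upper sub-modules stacked on a multilingual model's lower sub-modules.

```python
# Illustrative sketch (assumed names, not the PLUGIN-X implementation):
# split two Transformer-like models into sub-modules, then plug the
# multilingual lower half under the task-tuned monolingual upper half.

class SubModule:
    """Stand-in for a stack of Transformer layers."""
    def __init__(self, name):
        self.name = name

    def forward(self, trace):
        # Record which sub-module processed the input, for demonstration.
        return trace + [self.name]

def disassemble(layers, split):
    """Split a model's layers into lower and upper sub-module lists."""
    return layers[:split], layers[split:]

def reassemble(multilingual_lower, task_upper):
    """Stack the task-tuned upper sub-modules on the multilingual lower ones."""
    return multilingual_lower + task_upper

# Monolingual model fine-tuned on the end task (6 layers, illustrative).
mono = [SubModule(f"mono_{i}") for i in range(6)]
# Multilingual pretrained model (6 layers, illustrative).
multi = [SubModule(f"multi_{i}") for i in range(6)]

_, task_upper = disassemble(mono, split=3)    # keep the task-specific upper half
multi_lower, _ = disassemble(multi, split=3)  # take the multilingual lower half
end_task_model = reassemble(multi_lower, task_upper)

trace = []
for layer in end_task_model:
    trace = layer.forward(trace)
print(trace)
# → ['multi_0', 'multi_1', 'multi_2', 'mono_3', 'mono_4', 'mono_5']
```

In the actual method, a representation-adaptation step aligns the two models' hidden spaces at the seam so the upper sub-modules receive inputs they can interpret; this sketch omits that step entirely.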

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics, ACL 2023
Publisher: Association for Computational Linguistics (ACL)
Pages: 12572-12584
Number of pages: 13
ISBN (Electronic): 9781959429623
Publication status: Published - 2023
Event: 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 - Toronto, Canada
Duration: 9 Jul 2023 – 14 Jul 2023

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Country/Territory: Canada
City: Toronto
Period: 9/07/23 – 14/07/23

