INFOXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian Ling Mao, Heyan Huang*, Ming Zhou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

183 Citations (Scopus)

Abstract

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts. The unified view helps us better understand the existing methods for learning cross-lingual representations. More importantly, inspired by the framework, we propose a new pre-training task based on contrastive learning. Specifically, we regard a bilingual sentence pair as two views of the same meaning and encourage their encoded representations to be more similar than the negative examples. By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models. Experimental results on several benchmarks show that our approach achieves considerably better performance. The code and pre-trained models are available at https://aka.ms/infoxlm.
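The contrastive task described in the abstract can be illustrated with an InfoNCE-style loss, a standard lower bound on mutual information. The following is a minimal sketch and not the authors' implementation: it assumes in-batch negatives, cosine similarity, and a temperature of 0.1, none of which are specified in the abstract. The function name `info_nce_loss` and the random "embeddings" are hypothetical and stand in for the outputs of a sentence encoder.

```python
import numpy as np


def info_nce_loss(src, tgt, temperature=0.1):
    """InfoNCE-style loss over a batch of aligned sentence embeddings.

    Assumption: src[i] and tgt[i] encode a translation pair, and every
    other tgt[j] in the batch acts as a negative example for src[i].
    """
    # L2-normalize rows so the dot product equals cosine similarity.
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    logits = src @ tgt.T / temperature
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive pair for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


# Toy data: "translations" are the source vectors plus small noise.
rng = np.random.default_rng(0)
en = rng.normal(size=(8, 16))
aligned = en + 0.01 * rng.normal(size=(8, 16))
# Shifting rows by one breaks the pairing, so positives no longer match.
shuffled = np.roll(aligned, 1, axis=0)

loss_aligned = info_nce_loss(en, aligned)
loss_shuffled = info_nce_loss(en, shuffled)
print(loss_aligned < loss_shuffled)
```

In this toy setup, correctly paired embeddings should give a lower loss than mismatched ones. Minimizing such a loss pulls the two views of the same meaning together relative to the other sentences in the batch.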

Original language: English
Title of host publication: NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 3576-3588
Number of pages: 13
ISBN (Electronic): 9781954085466
Publication status: Published - 2021
Event: 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021 - Virtual, Online
Duration: 6 Jun 2021 to 11 Jun 2021

Publication series

Name: NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference

Conference

Conference: 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021
City: Virtual, Online
Period: 6/06/21 to 11/06/21
