INFOXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian Ling Mao, Heyan Huang*, Ming Zhou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

183 Citations (Scopus)

Abstract

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts. The unified view helps us to better understand the existing methods for learning cross-lingual representations. More importantly, inspired by the framework, we propose a new pre-training task based on contrastive learning. Specifically, we regard a bilingual sentence pair as two views of the same meaning and encourage their encoded representations to be more similar than the negative examples. By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models. Experimental results on several benchmarks show that our approach achieves considerably better performance. The code and pre-trained models are available at https://aka.ms/infoxlm.
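The contrastive idea in the abstract (a translation pair should score higher than negative examples) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not state the loss, so the sketch assumes the common InfoNCE form with in-batch negatives, cosine similarity, and a temperature of 0.1. The function name `info_nce_loss` and the toy vectors are hypothetical.

```python
import math


def info_nce_loss(src_vecs, tgt_vecs, temperature=0.1):
    """Mean InfoNCE loss over a batch of paired sentence vectors.

    Assumption: src_vecs[i] and tgt_vecs[i] encode a translation pair, and
    every other tgt_vecs[j] in the batch serves as a negative example.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    src = [normalize(v) for v in src_vecs]
    tgt = [normalize(v) for v in tgt_vecs]

    total = 0.0
    for i, s in enumerate(src):
        # Cosine similarity of source i to every target, scaled by temperature.
        logits = [sum(a * b for a, b in zip(s, t)) / temperature for t in tgt]
        # Numerically stable log-sum-exp over the positive and the negatives.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the aligned translation (index i) as the target.
        total += log_denominator - logits[i]
    return total / len(src)


# Toy 2-d "sentence embeddings" (made-up numbers, for illustration only).
english = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]    # translations point the same way
shuffled = [[0.1, 0.9], [0.9, 0.1]]   # pairs deliberately mismatched

print(info_nce_loss(english, aligned) < info_nce_loss(english, shuffled))
```

In this toy batch the correctly aligned pairs receive a lower loss than the mismatched ones, which is the behaviour a contrastive objective rewards. A real encoder, negative sampling scheme, and temperature would differ.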

Original language: English
Title of host publication: NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 3576-3588
Number of pages: 13
ISBN (Electronic): 9781954085466
Publication status: Published - 2021
Event: 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021 - Virtual, Online
Duration: 6 Jun 2021 to 11 Jun 2021

Publication series

Name: NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference

Conference

Conference: 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021
Virtual, Online
Period: 6/06/21 to 11/06/21
