TY - GEN
T1 - XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
T2 - 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022
AU - Chi, Zewen
AU - Huang, Shaohan
AU - Dong, Li
AU - Ma, Shuming
AU - Zheng, Bo
AU - Singhal, Saksham
AU - Bajaj, Payal
AU - Song, Xia
AU - Mao, Xian-Ling
AU - Huang, Heyan
AU - Wei, Furu
N1 - Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
AB - In this paper, we introduce ELECTRA-style tasks (Clark et al., 2020b) to cross-lingual language model pre-training. Specifically, we present two pre-training tasks: multilingual replaced token detection and translation replaced token detection. We pre-train the model, named XLM-E, on both multilingual and parallel corpora. Our model outperforms the baseline models on various cross-lingual understanding tasks at a much lower computation cost. Moreover, our analysis shows that XLM-E tends to obtain better cross-lingual transferability.
UR - http://www.scopus.com/inward/record.url?scp=85140387868&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85140387868
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 6170
EP - 6182
BT - ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
A2 - Muresan, Smaranda
A2 - Nakov, Preslav
A2 - Villavicencio, Aline
PB - Association for Computational Linguistics (ACL)
Y2 - 22 May 2022 through 27 May 2022
ER -