XLM-E: Cross-lingual Language Model Pre-training via ELECTRA

Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian Ling Mao, Heyan Huang, Furu Wei

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

50 Citations (Scopus)

Abstract

In this paper, we introduce ELECTRA-style tasks (Clark et al., 2020b) to cross-lingual language model pre-training. Specifically, we present two pre-training tasks, namely multilingual replaced token detection and translation replaced token detection. We pre-train the model, named XLM-E, on both multilingual and parallel corpora. Our model outperforms the baseline models on various cross-lingual understanding tasks at a much lower computation cost. Moreover, analysis shows that XLM-E tends to obtain better cross-lingual transferability.
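The following is an illustrative sketch of ELECTRA-style replaced token detection (RTD), the idea behind the multilingual RTD and translation RTD tasks named in the abstract. It is not the authors' implementation; the tiny encoder, model sizes, and 15% corruption rate are assumptions for demonstration only.

# Sketch of ELECTRA-style replaced token detection (RTD); hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    # Stand-in Transformer encoder; XLM-E uses a full multilingual Transformer.
    def __init__(self, vocab_size, hidden=128, heads=4, layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)
    def forward(self, ids):
        return self.encoder(self.embed(ids))          # (batch, seq, hidden)

vocab_size, hidden = 1000, 128
generator     = TinyEncoder(vocab_size, hidden)       # small masked-LM generator
gen_head      = nn.Linear(hidden, vocab_size)
discriminator = TinyEncoder(vocab_size, hidden)       # the model being pre-trained
disc_head     = nn.Linear(hidden, 1)                  # per-token original/replaced score

ids  = torch.randint(0, vocab_size, (2, 16))          # toy batch; for translation RTD this would
                                                      # be a sentence concatenated with its translation
mask = torch.rand(ids.shape) < 0.15                   # positions to corrupt

# 1) The generator proposes plausible replacements at the corrupted positions.
#    (Detached here for brevity; in ELECTRA the generator is trained jointly with an MLM loss.)
with torch.no_grad():
    sampled = torch.distributions.Categorical(logits=gen_head(generator(ids))).sample()
corrupted = torch.where(mask, sampled, ids)

# 2) The discriminator is trained to tell, for every token, whether it was replaced.
labels     = (corrupted != ids).float()
rtd_logits = disc_head(discriminator(corrupted)).squeeze(-1)
rtd_loss   = F.binary_cross_entropy_with_logits(rtd_logits, labels)
print(rtd_loss.item())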

Original language: English
Title of host publication: ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Publisher: Association for Computational Linguistics (ACL)
Pages: 6170-6182
Number of pages: 13
ISBN (electronic): 9781955917216
Publication status: Published - 2022
Event: 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022 - Dublin, Ireland
Duration: 22 May 2022 - 27 May 2022

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
1
ISSN (Print): 0736-587X

Conference

Conference: 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022
Country/Territory: Ireland
City: Dublin
Period: 22/05/22 - 27/05/22
