Improving pretrained cross-lingual language models via self-labeled word alignment

Zewen Chi^*, Li Dong, Bo Zheng^*, Shaohan Huang, Xian Ling Mao, Heyan Huang, Furu Wei

^*Corresponding author for this work

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

54 Citations (Scopus)

Abstract

The cross-lingual language models are typically pretrained with masked language modeling on multilingual text or parallel sentences. In this paper, we introduce denoising word alignment as a new cross-lingual pre-training task. Specifically, the model first self-labels word alignments for parallel sentences. Then we randomly mask tokens in a bitext pair. Given a masked token, the model uses a pointer network to predict the aligned token in the other language. We alternately perform the above two steps in an expectation-maximization manner. Experimental results show that our method improves cross-lingual transferability on various datasets, especially on the token-level tasks, such as question answering, and structured prediction. Moreover, the model can serve as a pretrained word aligner, which achieves reasonably low error rates on the alignment benchmarks. The code and pretrained parameters are available at github.com/CZWin32768/XLM-Align.
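The two steps named in the abstract (self-labeling alignments, then predicting the aligned token for a masked position with a pointer network) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the 2-d vectors, the example words, and the helper names `self_label` and `pointer_loss` are hypothetical, and the argmax-of-dot-product labeling is a simple stand-in rather than the labeling procedure used in the paper. The sketch runs one round only and updates no parameters, whereas the abstract describes alternating the two steps during training.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_label(src_vecs, tgt_vecs):
    """Step 1 (stand-in): align each source token to its most similar target token."""
    return [
        max(range(len(tgt_vecs)), key=lambda j: dot(s, tgt_vecs[j]))
        for s in src_vecs
    ]

def pointer_loss(query, tgt_vecs, gold_index):
    """Step 2: pointer distribution over target positions and its cross-entropy."""
    probs = softmax([dot(query, t) for t in tgt_vecs])
    return probs, -math.log(probs[gold_index])

# Hypothetical hidden states for a two-token bitext pair.
src = [[1.0, 0.0], [0.0, 1.0]]   # e.g. "black", "cat"
tgt = [[0.1, 0.9], [0.9, 0.1]]   # e.g. "chat", "noir"

labels = self_label(src, tgt)    # labels == [1, 0]

# Pretend source token 0 was masked, so its representation is noisier.
masked_query = [0.8, 0.2]
probs, loss = pointer_loss(masked_query, tgt, labels[0])
print(labels, probs, loss)
```

In this sketch the pointer distribution still favors the self-labeled target position for the masked token, and the cross-entropy term is the quantity a training loop would minimize before relabeling.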

Original language	English
Title of host publication	ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference
Publisher	Association for Computational Linguistics (ACL)
Pages	3418-3430
Number of pages	13
ISBN (Electronic)	9781954085527
Publication status	Published - 2021
Event	Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021 - Virtual, Online Duration: 1 Aug 2021 → 6 Aug 2021

Publication series

Name	ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference

Conference

Conference	Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021
City	Virtual, Online
Period	1/08/21 → 6/08/21

Cite this

Chi, Z., Dong, L., Zheng, B., Huang, S., Mao, X. L., Huang, H., & Wei, F. (2021). Improving pretrained cross-lingual language models via self-labeled word alignment. In ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference (pp. 3418-3430). (ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference). Association for Computational Linguistics (ACL).

Chi, Zewen; Dong, Li; Zheng, Bo et al. / Improving pretrained cross-lingual language models via self-labeled word alignment. ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference. Association for Computational Linguistics (ACL), 2021. pp. 3418-3430 (ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference).

@inproceedings{27f52af2f78e4e3b90705baa4888cf6c,
title = "Improving pretrained cross-lingual language models via self-labeled word alignment",
abstract = "The cross-lingual language models are typically pretrained with masked language modeling on multilingual text or parallel sentences. In this paper, we introduce denoising word alignment as a new cross-lingual pre-training task. Specifically, the model first self-labels word alignments for parallel sentences. Then we randomly mask tokens in a bitext pair. Given a masked token, the model uses a pointer network to predict the aligned token in the other language. We alternately perform the above two steps in an expectation-maximization manner. Experimental results show that our method improves cross-lingual transferability on various datasets, especially on the token-level tasks, such as question answering, and structured prediction. Moreover, the model can serve as a pretrained word aligner, which achieves reasonably low error rates on the alignment benchmarks. The code and pretrained parameters are available at github.com/CZWin32768/XLM-Align.",
author = "Zewen Chi and Li Dong and Bo Zheng and Shaohan Huang and Mao, {Xian Ling} and Heyan Huang and Furu Wei",
note = "Publisher Copyright: {\textcopyright} 2021 Association for Computational Linguistics; Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021 ; Conference date: 01-08-2021 Through 06-08-2021",
year = "2021",
language = "English",
series = "ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference",
publisher = "Association for Computational Linguistics (ACL)",
pages = "3418--3430",
booktitle = "ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference",
address = "United States",
}

Chi, Z, Dong, L, Zheng, B, Huang, S, Mao, XL, Huang, H & Wei, F 2021, Improving pretrained cross-lingual language models via self-labeled word alignment. in ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference. ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference, Association for Computational Linguistics (ACL), pp. 3418-3430, Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021, Virtual, Online, 1/08/21.

Improving pretrained cross-lingual language models via self-labeled word alignment. / Chi, Zewen; Dong, Li; Zheng, Bo et al.
ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference. Association for Computational Linguistics (ACL), 2021. pp. 3418-3430 (ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference).


TY - GEN
T1 - Improving pretrained cross-lingual language models via self-labeled word alignment
AU - Chi, Zewen
AU - Dong, Li
AU - Zheng, Bo
AU - Huang, Shaohan
AU - Mao, Xian Ling
AU - Huang, Heyan
AU - Wei, Furu
PY - 2021
Y1 - 2021
N2 - The cross-lingual language models are typically pretrained with masked language modeling on multilingual text or parallel sentences. In this paper, we introduce denoising word alignment as a new cross-lingual pre-training task. Specifically, the model first self-labels word alignments for parallel sentences. Then we randomly mask tokens in a bitext pair. Given a masked token, the model uses a pointer network to predict the aligned token in the other language. We alternately perform the above two steps in an expectation-maximization manner. Experimental results show that our method improves cross-lingual transferability on various datasets, especially on the token-level tasks, such as question answering, and structured prediction. Moreover, the model can serve as a pretrained word aligner, which achieves reasonably low error rates on the alignment benchmarks. The code and pretrained parameters are available at github.com/CZWin32768/XLM-Align.
AB - The cross-lingual language models are typically pretrained with masked language modeling on multilingual text or parallel sentences. In this paper, we introduce denoising word alignment as a new cross-lingual pre-training task. Specifically, the model first self-labels word alignments for parallel sentences. Then we randomly mask tokens in a bitext pair. Given a masked token, the model uses a pointer network to predict the aligned token in the other language. We alternately perform the above two steps in an expectation-maximization manner. Experimental results show that our method improves cross-lingual transferability on various datasets, especially on the token-level tasks, such as question answering, and structured prediction. Moreover, the model can serve as a pretrained word aligner, which achieves reasonably low error rates on the alignment benchmarks. The code and pretrained parameters are available at github.com/CZWin32768/XLM-Align.
UR - http://www.scopus.com/inward/record.url?scp=85116700226&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85116700226
T3 - ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference
SP - 3418
EP - 3430
BT - ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
T2 - Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021
Y2 - 1 August 2021 through 6 August 2021
ER -

Chi Z, Dong L, Zheng B, Huang S, Mao XL, Huang H et al. Improving pretrained cross-lingual language models via self-labeled word alignment. In ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference. Association for Computational Linguistics (ACL). 2021. p. 3418-3430. (ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference).
