Cross-lingual natural language generation via pre-training

Zewen Chi^*, Li Dong, Furu Wei, Wenhui Wang, Xian Ling Mao, Heyan Huang

^*Corresponding author for this work

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

84 Citations (Scopus)

Abstract

In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages. We propose to pretrain the encoder and the decoder of a sequence-to-sequence model under both monolingual and cross-lingual settings. The pre-training objective encourages the model to represent different languages in the shared space, so that we can conduct zero-shot cross-lingual transfer. After the pre-training procedure, we use monolingual data to fine-tune the pre-trained model on downstream NLG tasks. Then the sequence-to-sequence model trained in a single language can be directly evaluated beyond that language (i.e., accepting multi-lingual input and producing multi-lingual output). Experimental results on question generation and abstractive summarization show that our model outperforms the machine-translation-based pipeline methods for zero-shot cross-lingual generation. Moreover, cross-lingual transfer improves NLG performance of low-resource languages by leveraging rich-resource language data. Our implementation and data are available at https://github.com/CZWin32768/xnlg.

Original language	English
Title of host publication	AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
Publisher	AAAI press
Pages	7570-7577
Number of pages	8
ISBN (Electronic)	9781577358350
Publication status	Published - 2020
Event	34th AAAI Conference on Artificial Intelligence, AAAI 2020 - New York, United States. Duration: 7 Feb 2020 → 12 Feb 2020

Publication series

Name	AAAI 2020 - 34th AAAI Conference on Artificial Intelligence

Conference

Conference	34th AAAI Conference on Artificial Intelligence, AAAI 2020
Country/Territory	United States
City	New York
Period	7/02/20 → 12/02/20

Cite this

Chi, Z., Dong, L., Wei, F., Wang, W., Mao, X. L., & Huang, H. (2020). Cross-lingual natural language generation via pre-training. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (pp. 7570-7577). (AAAI 2020 - 34th AAAI Conference on Artificial Intelligence). AAAI press.

@inproceedings{ae240ee1ba6b41dd8f89b513f3120604,
title = "Cross-lingual natural language generation via pre-training",
abstract = "In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages. We propose to pretrain the encoder and the decoder of a sequence-to-sequence model under both monolingual and cross-lingual settings. The pre-training objective encourages the model to represent different languages in the shared space, so that we can conduct zero-shot cross-lingual transfer. After the pre-training procedure, we use monolingual data to fine-tune the pre-trained model on downstream NLG tasks. Then the sequence-to-sequence model trained in a single language can be directly evaluated beyond that language (i.e., accepting multi-lingual input and producing multi-lingual output). Experimental results on question generation and abstractive summarization show that our model outperforms the machine-translation-based pipeline methods for zero-shot cross-lingual generation. Moreover, cross-lingual transfer improves NLG performance of low-resource languages by leveraging rich-resource language data. Our implementation and data are available at https://github.com/CZWin32768/xnlg.",
author = "Zewen Chi and Li Dong and Furu Wei and Wenhui Wang and Mao, {Xian Ling} and Heyan Huang",
note = "Publisher Copyright: Copyright {\textcopyright} 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.; 34th AAAI Conference on Artificial Intelligence, AAAI 2020 ; Conference date: 07-02-2020 Through 12-02-2020",
year = "2020",
language = "English",
series = "AAAI 2020 - 34th AAAI Conference on Artificial Intelligence",
publisher = "AAAI press",
pages = "7570--7577",
booktitle = "AAAI 2020 - 34th AAAI Conference on Artificial Intelligence",
}

TY  - GEN
T1  - Cross-lingual natural language generation via pre-training
AU  - Chi, Zewen
AU  - Dong, Li
AU  - Wei, Furu
AU  - Wang, Wenhui
AU  - Mao, Xian Ling
AU  - Huang, Heyan
PY  - 2020
Y1  - 2020
N2  - In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages. We propose to pretrain the encoder and the decoder of a sequence-to-sequence model under both monolingual and cross-lingual settings. The pre-training objective encourages the model to represent different languages in the shared space, so that we can conduct zero-shot cross-lingual transfer. After the pre-training procedure, we use monolingual data to fine-tune the pre-trained model on downstream NLG tasks. Then the sequence-to-sequence model trained in a single language can be directly evaluated beyond that language (i.e., accepting multi-lingual input and producing multi-lingual output). Experimental results on question generation and abstractive summarization show that our model outperforms the machine-translation-based pipeline methods for zero-shot cross-lingual generation. Moreover, cross-lingual transfer improves NLG performance of low-resource languages by leveraging rich-resource language data. Our implementation and data are available at https://github.com/CZWin32768/xnlg.
UR  - http://www.scopus.com/inward/record.url?scp=85106580912&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:85106580912
T3  - AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
SP  - 7570
EP  - 7577
BT  - AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
PB  - AAAI press
T2  - 34th AAAI Conference on Artificial Intelligence, AAAI 2020
Y2  - 7 February 2020 through 12 February 2020
ER  -
