TY - JOUR
T1 - PSP: Pre-trained Soft Prompts for Few-Shot Abstractive Summarization
T2 - 29th International Conference on Computational Linguistics, COLING 2022
AU - Liu, Xiaochen
AU - Gao, Yang
AU - Bai, Yu
AU - Li, Jiawei
AU - Hu, Yinan
AU - Huang, Heyan
AU - Chen, Boxing
N1 - Publisher Copyright:
© 2022 Proceedings - International Conference on Computational Linguistics, COLING. All rights reserved.
PY - 2022
Y1 - 2022
AB - Few-shot abstractive summarization has become a challenging task in natural language generation. To support it, we developed a novel soft prompts architecture coupled with a prompt pre-training plus prompt fine-tuning paradigm, which is effective and tunes only an extremely small number of parameters. To match the structure of generation models, the soft prompts comprise continuous input embeddings across both the encoder and the decoder. Importantly, a new inner-prompt placed within the text is introduced to capture document-level information. The aim is to direct attention toward understanding the document so that the model is better prompted to generate document-related content. In the training process, prompt pre-training with self-supervised pseudo-data first teaches the model basic summarization capability. Then, with few-shot examples, only the designed lightweight soft prompts are fine-tuned. Experimental results on the CNN/DailyMail and XSum datasets show that our method, with only 0.1% of the parameters, outperforms full-model tuning, where all model parameters are tuned. It also surpasses Prompt Tuning by a large margin and delivers competitive results against Prefix-Tuning with 3% of the parameters.
UR - http://www.scopus.com/inward/record.url?scp=85140708674&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85140708674
SN - 2951-2093
VL - 29
SP - 6355
EP - 6368
JO - Proceedings - International Conference on Computational Linguistics, COLING
JF - Proceedings - International Conference on Computational Linguistics, COLING
IS - 1
Y2 - 12 October 2022 through 17 October 2022
ER -