PSP: Pre-trained Soft Prompts for Few-Shot Abstractive Summarization

Xiaochen Liu, Yang Gao^*, Yu Bai, Jiawei Li, Yinan Hu, Heyan Huang, Boxing Chen

^*Corresponding author for this work

School of Computer Science and Technology

Beijing Institute of Technology

Research output: Contribution to journal › Conference article › peer-review

8 Citations (Scopus)

Abstract

Few-shot abstractive summarization has become a challenging task in natural language generation. To support it, we developed a novel soft-prompt architecture coupled with a prompt pre-training plus prompt fine-tuning paradigm, which is effective and tunes only an extremely small number of parameters. To match the structure of generation models, the soft prompts comprise continuous input embeddings across an encoder and a decoder. Importantly, a new inner-prompt placed in the text is introduced to capture document-level information. The aim is to direct attention toward understanding the document, which better prompts the model to generate document-related content. In the training process, prompt pre-training with self-supervised pseudo-data first teaches the model basic summarizing capability. Then, with few-shot examples, only the designed lightweight soft prompts are fine-tuned. Experimental results on the CNN/DailyMail and XSum datasets show that our method, with only 0.1% of the parameters, outperforms full-model tuning, in which all model parameters are tuned. It also surpasses Prompt Tuning by a large margin and delivers competitive results against Prefix-Tuning with 3% of the parameters.
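The general idea of soft prompts on both the encoder and decoder side can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows trainable prompt vectors being prepended to frozen token embeddings and how small the trainable share of parameters can be. All sizes and names (`EMBED_DIM`, `PROMPT_LEN`, `VOCAB_SIZE`, `with_soft_prompt`, `trainable_fraction`) are hypothetical, the frozen "model" is reduced to a single embedding table, and the inner-prompt is not modeled.

```python
import random

# Hypothetical toy sizes, chosen only for illustration.
EMBED_DIM = 8    # width of each embedding vector
PROMPT_LEN = 4   # number of soft-prompt vectors per side
VOCAB_SIZE = 50  # size of the frozen token embedding table

random.seed(0)


def random_matrix(rows, cols):
    """Return a rows x cols list of small random floats."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# Frozen: a token embedding table standing in for the pre-trained model.
frozen_embeddings = random_matrix(VOCAB_SIZE, EMBED_DIM)

# Trainable: one soft prompt for the encoder input, one for the decoder input.
encoder_prompt = random_matrix(PROMPT_LEN, EMBED_DIM)
decoder_prompt = random_matrix(PROMPT_LEN, EMBED_DIM)


def with_soft_prompt(prompt, token_ids):
    """Prepend the prompt vectors to the looked-up token embeddings."""
    return prompt + [frozen_embeddings[t] for t in token_ids]


def trainable_fraction():
    """Share of all parameters in this toy setup that would be tuned."""
    trainable = 2 * PROMPT_LEN * EMBED_DIM
    frozen = VOCAB_SIZE * EMBED_DIM
    return trainable / (trainable + frozen)


encoder_input = with_soft_prompt(encoder_prompt, [3, 17, 42])
decoder_input = with_soft_prompt(decoder_prompt, [5])
print(len(encoder_input), len(decoder_input))
print(round(trainable_fraction(), 3))
```

In this toy setup the trainable share is large because the frozen part is only a tiny embedding table; with a full pre-trained encoder-decoder the frozen parameter count dominates, which is how a prompt-only method can tune a fraction of a percent of the parameters.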

Original language	English
Pages (from-to)	6355-6368
Number of pages	14
Journal	Proceedings - International Conference on Computational Linguistics, COLING
Volume	29
Issue number	1
Publication status	Published - 2022
Event	29th International Conference on Computational Linguistics, COLING 2022 - Gyeongju, Republic of Korea
Duration	12 Oct 2022 → 17 Oct 2022

Cite this

@article{d7aa3c954b684110b2bfbe6afc12aaa8,
  title = "PSP: Pre-trained Soft Prompts for Few-Shot Abstractive Summarization",
  abstract = "Few-shot abstractive summarization has become a challenging task in natural language generation. To support it, we developed a novel soft prompts architecture coupled with a prompt pre-training plus prompt fine-tuning paradigm, which is effective and tunes only extremely light parameters. To meet the structure of the generation models, the soft prompts comprise continuous input embeddings across an encoder and a decoder. Importantly, a new inner-prompt placed in the text is introduced to capture document-level information. The aim is to devote attention to understanding the document that better prompts the model to generate document-related content. In the training process, the prompt pre-training with self-supervised pseudo-data firstly teaches the model basic summarizing capability. Then, with few-shot examples, only the designed lightweight soft prompts are fine-tuned. Experimental results on the CNN/DailyMail and XSum datasets show that our method, with only 0.1\% of the parameters, outperforms full-model tuning where all model parameters are tuned. It also surpasses Prompt Tuning by a large margin and delivers competitive results against Prefix-Tuning with 3\% of the parameters.",
  author = "Xiaochen Liu and Yang Gao and Yu Bai and Jiawei Li and Yinan Hu and Heyan Huang and Boxing Chen",
  note = "Publisher Copyright: {\textcopyright} 2022 Proceedings - International Conference on Computational Linguistics, COLING. All rights reserved.; 29th International Conference on Computational Linguistics, COLING 2022 ; Conference date: 12-10-2022 Through 17-10-2022",
  year = "2022",
  language = "English",
  volume = "29",
  pages = "6355--6368",
  journal = "Proceedings - International Conference on Computational Linguistics, COLING",
  issn = "2951-2093",
  publisher = "Association for Computational Linguistics (ACL)",
  number = "1",
}

TY - JOUR

T1 - PSP: Pre-trained Soft Prompts for Few-Shot Abstractive Summarization

T2 - 29th International Conference on Computational Linguistics, COLING 2022

AU - Liu, Xiaochen

AU - Gao, Yang

AU - Bai, Yu

AU - Li, Jiawei

AU - Hu, Yinan

AU - Huang, Heyan

AU - Chen, Boxing

PY - 2022

Y1 - 2022

N2 - Few-shot abstractive summarization has become a challenging task in natural language generation. To support it, we developed a novel soft prompts architecture coupled with a prompt pre-training plus prompt fine-tuning paradigm, which is effective and tunes only extremely light parameters. To meet the structure of the generation models, the soft prompts comprise continuous input embeddings across an encoder and a decoder. Importantly, a new inner-prompt placed in the text is introduced to capture document-level information. The aim is to devote attention to understanding the document that better prompts the model to generate document-related content. In the training process, the prompt pre-training with self-supervised pseudo-data firstly teaches the model basic summarizing capability. Then, with few-shot examples, only the designed lightweight soft prompts are fine-tuned. Experimental results on the CNN/DailyMail and XSum datasets show that our method, with only 0.1% of the parameters, outperforms full-model tuning where all model parameters are tuned. It also surpasses Prompt Tuning by a large margin and delivers competitive results against Prefix-Tuning with 3% of the parameters.

UR - http://www.scopus.com/inward/record.url?scp=85140708674&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85140708674

SN - 2951-2093

VL - 29

SP - 6355

EP - 6368

JO - Proceedings - International Conference on Computational Linguistics, COLING

JF - Proceedings - International Conference on Computational Linguistics, COLING

IS - 1

Y2 - 12 October 2022 through 17 October 2022

ER -
