In-Context Learning Reward Guided Decoding for Controlled Text Generation

Xinyi Zhu, Yanru Zhou, Dandan Song*, Ziyi Yang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

While large language models have demonstrated remarkable text generation capabilities, they often generate text with adverse or undesired attributes. Common approaches to controlling text generation involve fine-tuning models on data with the desired properties or guiding a language model's decoding with an auxiliary model. However, these methods require additional training and extensive attribute-specific data. To reduce these training costs, we propose In-context learning Reward Guided Decoding (IRGD), a weighted decoding method that exploits the in-context learning (ICL) ability of language models as an alternative to additional model fine-tuning. Specifically, IRGD uses ICL outputs to score the alignment reward between sequences and target attributes, then modifies the sampling probabilities to favor tokens with higher reward scores. By applying ICL, IRGD adapts to different tasks by simply adjusting the task description and demonstrations rather than fine-tuning the model. Through experiments on detoxification and sentiment control, we demonstrate the advantages of IRGD as a plug-and-play, fine-tuning-free decoding method that effectively balances attribute alignment and text quality.
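The reweighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the additive form used here (base log-score plus `beta` times reward, followed by a softmax) is an assumption borrowed from common weighted-decoding practice, not necessarily the authors' formulation. The names `reward_guided_distribution`, `toy_icl_reward`, and `beta` are hypothetical; in particular, `toy_icl_reward` is a keyword stand-in for a scorer that would prompt a language model with a task description and demonstrations.

```python
import math
from typing import Callable, Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reward_guided_distribution(
    prefix: str,
    candidates: Dict[str, float],
    reward_fn: Callable[[str], float],
    beta: float = 1.0,
) -> Dict[str, float]:
    """Reweight candidate next tokens by an attribute reward.

    candidates maps each candidate token to its base log-score from the
    language model. Assumption: the adjusted score is
    base_score + beta * reward(prefix + token); the source abstract does
    not specify how the reward and base probabilities are combined.
    """
    tokens = list(candidates)
    adjusted = [candidates[t] + beta * reward_fn(prefix + t) for t in tokens]
    return dict(zip(tokens, softmax(adjusted)))


def toy_icl_reward(text: str) -> float:
    """Hypothetical stand-in for an ICL-based reward scorer.

    A real scorer would query a language model with a task description
    and demonstrations; this toy version only checks for one keyword.
    """
    return 0.0 if "awful" in text else 1.0


if __name__ == "__main__":
    probs = reward_guided_distribution(
        "The movie was",
        {" great": 0.0, " awful": 0.0},  # equal base scores
        toy_icl_reward,
        beta=2.0,
    )
    print(probs)
```

Under these assumptions, setting `beta` to 0 recovers the base distribution, while larger values shift more probability mass toward tokens the scorer rewards. Swapping the task means swapping `reward_fn`, which mirrors the plug-and-play property the abstract claims.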

Original language: English
Title of host publication: 2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1116-1120
Number of pages: 5
ISBN (Electronic): 9798350376548
Publication status: Published - 2024
Event: 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024 - Hybrid, Xi'an, China
Duration: 19 Apr 2024 to 21 Apr 2024

Publication series

Name: 2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024

Conference

Conference: 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024
Country/Territory: China
City: Hybrid, Xi'an
Period: 19/04/24 to 21/04/24
