In-Context Learning Reward Guided Decoding for Controlled Text Generation

Xinyi Zhu, Yanru Zhou, Dandan Song*, Ziyi Yang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

While large language models have demonstrated remarkable text generation capabilities, they often generate text with adverse or undesired attributes. Common approaches to controlling text generation involve fine-tuning models on data with the desired properties or guiding the decoding of a language model with an auxiliary model. However, these methods require additional training and extensive attribute-specific data. To reduce these training costs, we propose In-context learning Reward Guided Decoding (IRGD), a weighted decoding method that exploits the in-context learning (ICL) ability of language models as an alternative to additional model fine-tuning. Specifically, IRGD uses ICL outputs to score the alignment reward between sequences and target attributes, then modifies the sampling probabilities to favor tokens with higher reward scores. By applying ICL, IRGD adapts to different tasks by simply adjusting the task descriptions and demonstrations rather than fine-tuning the model. Through experiments on detoxification and sentiment control, we demonstrate the advantages of IRGD as a plug-and-play, fine-tuning-free decoding method that effectively balances attribute alignment and text quality.
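The reweighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the exponential form p'(t) ∝ p(t)·exp(β·r(t)), the `beta` parameter, and the function names `reweight` and `sample` are all assumptions chosen for illustration. The reward values are hard-coded placeholders standing in for scores that IRGD would derive from in-context learning outputs.

```python
import math
import random


def reweight(token_probs, rewards, beta=1.0):
    """Return p'(t) proportional to p(t) * exp(beta * r(t)).

    Assumed form of weighted decoding, not taken from the paper.
    Tokens without a reward entry are treated as reward 0.0.
    """
    scaled = {
        tok: p * math.exp(beta * rewards.get(tok, 0.0))
        for tok, p in token_probs.items()
    }
    total = sum(scaled.values())
    return {tok: value / total for tok, value in scaled.items()}


def sample(probs, rng):
    """Draw one token from a {token: probability} mapping."""
    tokens = list(probs)
    weights = [probs[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy next-token distribution from a hypothetical base language model.
base = {"great": 0.2, "awful": 0.5, "fine": 0.3}

# Placeholder rewards for a "positive sentiment" target attribute.
# In IRGD these would come from ICL outputs; here they are made up.
rewards = {"great": 1.0, "awful": -1.0, "fine": 0.2}

guided = reweight(base, rewards, beta=2.0)
next_token = sample(guided, random.Random(0))
```

In this sketch, setting `beta` to 0 recovers the base distribution, while larger values shift more probability mass toward high-reward tokens at the cost of drifting further from the base model.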

Original language: English
Title of host publication: 2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1116-1120
Number of pages: 5
ISBN (Electronic): 9798350376548
Publication status: Published - 2024
Event: 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024 - Hybrid, Xi'an, China
Duration: 19 Apr 2024 to 21 Apr 2024

Publication series

Name: 2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024

Conference

Conference: 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024
Country/Territory: China
City: Hybrid, Xi'an
Period: 19/04/24 to 21/04/24

Keywords

  • controlled text generation
  • in-context learning
  • weighted decoding
