In-Context Learning Reward Guided Decoding for Controlled Text Generation

Xinyi Zhu; Yanru Zhou; Dandan Song; Ziyi Yang

doi:10.1109/ICSP62122.2024.10743861

Xinyi Zhu, Yanru Zhou, Dandan Song^*, Ziyi Yang

^*Corresponding author for this work

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

While large language models have demonstrated remarkable text generation capabilities, they often generate text with adverse or undesired attributes. Common approaches to controlling text generation involve fine-tuning models on data with the desired properties or guiding language model decoding with an auxiliary model. However, these methods require additional training and extensive attribute-specific data. To further reduce training costs, we propose In-context learning Reward Guided Decoding (IRGD), a weighted decoding method that exploits the in-context learning (ICL) ability of language models as an alternative to additional model fine-tuning. Specifically, IRGD uses ICL outputs to score the alignment reward between sequences and target attributes, and then modifies the sampling probabilities to favor tokens with higher reward scores. By applying ICL, IRGD adapts to different tasks by simply adjusting task descriptions and demonstrations rather than fine-tuning the model. Through experiments on detoxification and sentiment control, we demonstrate the advantages of IRGD as a plug-and-play, fine-tuning-free decoding method that effectively balances attribute alignment and text quality.
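The abstract does not give the exact rule IRGD uses to modify the sampling probabilities, so the following is only a minimal sketch of generic reward-weighted decoding under stated assumptions. It assumes an exponential reweighting p'(x) ∝ p(x) · exp(β · r(x)) over a small set of candidate tokens, which is a common choice in weighted decoding but is not confirmed by the source. The function name `reweight`, the parameter `beta`, and all probabilities and reward values are hypothetical; in the method described above, the rewards would come from the language model's ICL outputs rather than being hard-coded.

```python
import math


def reweight(probs, rewards, beta=1.0):
    """Return p'(x) proportional to p(x) * exp(beta * r(x)) over candidate tokens.

    Assumption: exponential reweighting; the abstract does not specify the rule.
    """
    if probs.keys() != rewards.keys():
        raise ValueError("probs and rewards must cover the same candidate tokens")
    if any(p <= 0.0 for p in probs.values()):
        raise ValueError("candidate probabilities must be strictly positive")
    # Work in log space and subtract the maximum score for numerical stability.
    scores = {tok: math.log(p) + beta * rewards[tok] for tok, p in probs.items()}
    top = max(scores.values())
    unnorm = {tok: math.exp(s - top) for tok, s in scores.items()}
    total = sum(unnorm.values())
    return {tok: w / total for tok, w in unnorm.items()}


# Hypothetical top-3 next-token probabilities from the base language model.
base_probs = {"great": 0.5, "awful": 0.3, "fine": 0.2}
# Hypothetical alignment rewards in [0, 1] for a "positive sentiment" target;
# in practice these would be derived from an ICL prompt, not hard-coded.
icl_rewards = {"great": 0.9, "awful": 0.1, "fine": 0.6}

adjusted = reweight(base_probs, icl_rewards, beta=2.0)
```

In this sketch, a larger `beta` pushes probability mass more strongly toward high-reward tokens, while `beta = 0` leaves the base distribution unchanged, which is one simple way to trade attribute alignment against text quality.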

Original language	English
Title of host publication	2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	1116-1120
Number of pages	5
ISBN (Electronic)	9798350376548
DOIs	https://doi.org/10.1109/ICSP62122.2024.10743861
Publication status	Published - 2024
Event	9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024 - Hybrid, Xi'an, China
Duration	19 Apr 2024 → 21 Apr 2024

Publication series

Name	2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024

Conference

Conference	9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024
Country/Territory	China
City	Hybrid, Xi'an
Period	19/04/24 → 21/04/24

Keywords

controlled text generation
in-context learning
weighted decoding

Access to Document

10.1109/ICSP62122.2024.10743861

Cite this

Zhu, X., Zhou, Y., Song, D., & Yang, Z. (2024). In-Context Learning Reward Guided Decoding for Controlled Text Generation. In 2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024 (pp. 1116-1120). (2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICSP62122.2024.10743861

Zhu, Xinyi; Zhou, Yanru; Song, Dandan et al. / In-Context Learning Reward Guided Decoding for Controlled Text Generation. 2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024. Institute of Electrical and Electronics Engineers Inc., 2024. pp. 1116-1120 (2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024).

@inproceedings{7f8e280a72f64eccaab92b5d14808b47,

title = "In-Context Learning Reward Guided Decoding for Controlled Text Generation",

keywords = "controlled text generation, in-context learning, weighted decoding",

author = "Xinyi Zhu and Yanru Zhou and Dandan Song and Ziyi Yang",

note = "Publisher Copyright: {\textcopyright} 2024 IEEE.; 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024 ; Conference date: 19-04-2024 Through 21-04-2024",

year = "2024",

doi = "10.1109/ICSP62122.2024.10743861",

language = "English",

series = "2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "1116--1120",

booktitle = "2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024",

address = "United States",

}

Zhu, X, Zhou, Y, Song, D & Yang, Z 2024, In-Context Learning Reward Guided Decoding for Controlled Text Generation. in 2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024. 2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024, Institute of Electrical and Electronics Engineers Inc., pp. 1116-1120, 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024, Hybrid, Xi'an, China, 19/04/24. https://doi.org/10.1109/ICSP62122.2024.10743861

In-Context Learning Reward Guided Decoding for Controlled Text Generation. / Zhu, Xinyi; Zhou, Yanru; Song, Dandan et al.
2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024. Institute of Electrical and Electronics Engineers Inc., 2024. p. 1116-1120 (2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024).

TY - GEN

T1 - In-Context Learning Reward Guided Decoding for Controlled Text Generation

AU - Zhu, Xinyi

AU - Zhou, Yanru

AU - Song, Dandan

AU - Yang, Ziyi

PY - 2024

Y1 - 2024

KW - controlled text generation

KW - in-context learning

KW - weighted decoding

UR - http://www.scopus.com/inward/record.url?scp=85211499704&partnerID=8YFLogxK

U2 - 10.1109/ICSP62122.2024.10743861

DO - 10.1109/ICSP62122.2024.10743861

M3 - Conference contribution

AN - SCOPUS:85211499704

T3 - 2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024

SP - 1116

EP - 1120

BT - 2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024

Y2 - 19 April 2024 through 21 April 2024

ER -

Zhu X, Zhou Y, Song D, Yang Z. In-Context Learning Reward Guided Decoding for Controlled Text Generation. In 2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024. Institute of Electrical and Electronics Engineers Inc. 2024. p. 1116-1120. (2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024). doi: 10.1109/ICSP62122.2024.10743861
