Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback

Qian Dong; Yiding Liu; Qingyao Ai; Zhijing Wu; Haitao Li; Yiqun Liu; Shuaiqiang Wang; Dawei Yin; Shaoping Ma

doi:10.1145/3626772.3657689

Qian Dong, Yiding Liu, Qingyao Ai^*, Zhijing Wu, Haitao Li, Yiqun Liu, Shuaiqiang Wang, Dawei Yin, Shaoping Ma

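^*Corresponding author for this work

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

1 Citation (Scopus)

Abstract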

Large language models (LLMs) have demonstrated remarkable capabilities across various research domains, including the field of Information Retrieval (IR). However, the responses generated by off-the-shelf LLMs tend to be generic, i.e., they cannot capture the distinctiveness of each document among documents with similar content. This limits the performance of LLMs in IR, because finding and distinguishing relevant documents among a substantial number of similar documents is a typical problem in many IR tasks. To address this issue, we propose an unsupervised alignment method, namely Reinforcement Learning from Contrastive Feedback (RLCF), which empowers LLMs to generate both high-quality and context-specific responses. Our approach constructs unsupervised contrastive feedback signals based on similar document groups, and adopts a reward function, named group-wise reciprocal rank, to optimize LLMs. We conduct extensive experiments to evaluate the effectiveness of RLCF.
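The abstract names a group-wise reciprocal rank reward but does not define it. The following is a minimal sketch of one plausible reading, not the authors' implementation: a response generated for one target document is scored against every document in its group of similar documents, and the reward is the reciprocal of the target document's rank. The function names, the `token_overlap` scorer (a toy stand-in for a real retrieval model), and the tie-handling rule are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def token_overlap(response: str, document: str) -> float:
    """Toy relevance score (assumption): Jaccard overlap of lowercase tokens."""
    a, b = set(response.lower().split()), set(document.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def groupwise_reciprocal_rank(
    response: str,
    group: Sequence[str],
    target_index: int,
    score: Callable[[str, str], float] = token_overlap,
) -> float:
    """Return 1 / rank of the target document when the group is ranked by score.

    Hypothetical reading of the reward: a response that singles out its own
    source document among similar documents receives a reward close to 1.
    """
    scores = [score(response, doc) for doc in group]
    target = scores[target_index]
    # Rank is 1 plus the number of documents scoring strictly higher,
    # so ties are resolved in favour of the target (an assumption).
    rank = 1 + sum(1 for s in scores if s > target)
    return 1.0 / rank


group = [
    "red apple pie recipe",
    "green apple pie recipe",
    "apple cider recipe",
]
specific = groupwise_reciprocal_rank("green apple pie", group, target_index=1)
generic = groupwise_reciprocal_rank("apple recipe", group, target_index=1)
```

In this toy group, the response that mentions the distinguishing detail ranks its source document first, while the generic response lets another document outrank it and so earns a lower reward.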

Original language	English
Title of host publication	SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher	Association for Computing Machinery, Inc
Pages	48-58
Number of pages	11
ISBN (Electronic)	9798400704314
DOI	https://doi.org/10.1145/3626772.3657689
Publication status	Published - 10 Jul 2024
Event	47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024 - Washington, United States. Duration: 14 Jul 2024 → 18 Jul 2024

Publication series

Name	SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference	47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024
Country/Territory	United States
City	Washington
Period	14/07/24 → 18/07/24

Cite this

Dong, Q., Liu, Y., Ai, Q., Wu, Z., Li, H., Liu, Y., Wang, S., Yin, D., & Ma, S. (2024). Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback. In SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 48-58). (SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval). Association for Computing Machinery, Inc. https://doi.org/10.1145/3626772.3657689

Dong, Qian ; Liu, Yiding ; Ai, Qingyao et al. / Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback. SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, Inc, 2024. pp. 48-58 (SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval).

@inproceedings{c2748b93d4a443eeb8fd5a4cc18574f3,

title = "Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback",

abstract = "Large language models (LLMs) have demonstrated remarkable capabilities across various research domains, including the field of Information Retrieval (IR). However, the responses generated by off-the-shelf LLMs tend to be generic, i.e., cannot capture the distinctiveness of each document with similar content. This limits the performance of LLMs in IR because finding and distinguishing relevant documents from substantial similar documents is a typical problem in many IR tasks. To address this issue, we propose an unsupervised alignment method, namely Reinforcement Learning from Contrastive Feedback (RLCF), empowering LLMs to generate both high-quality and context-specific responses. Our approach constructs unsupervised contrastive feedback signals based on similar document groups, and adopts a reward function, named group-wise reciprocal rank, to optimize LLMs. We conduct extensive experiments to evaluate the effectiveness of RLCF.",

keywords = "alignment, information retrieval, large language models",

author = "Qian Dong and Yiding Liu and Qingyao Ai and Zhijing Wu and Haitao Li and Yiqun Liu and Shuaiqiang Wang and Dawei Yin and Shaoping Ma",

note = "Publisher Copyright: {\textcopyright} 2024 Owner/Author.; 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024 ; Conference date: 14-07-2024 Through 18-07-2024",

year = "2024",

month = jul,

day = "10",

doi = "10.1145/3626772.3657689",

language = "English",

series = "SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval",

publisher = "Association for Computing Machinery, Inc",

pages = "48--58",

booktitle = "SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval",

}

Dong, Q, Liu, Y, Ai, Q, Wu, Z, Li, H, Liu, Y, Wang, S, Yin, D & Ma, S 2024, Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback. In SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Association for Computing Machinery, Inc, pp. 48-58, 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024, Washington, United States, 14/07/24. https://doi.org/10.1145/3626772.3657689

Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback. / Dong, Qian; Liu, Yiding; Ai, Qingyao et al.
SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, Inc, 2024. pp. 48-58 (SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval).

TY - GEN

T1 - Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback

AU - Dong, Qian

AU - Liu, Yiding

AU - Ai, Qingyao

AU - Wu, Zhijing

AU - Li, Haitao

AU - Liu, Yiqun

AU - Wang, Shuaiqiang

AU - Yin, Dawei

AU - Ma, Shaoping

PY - 2024/7/10

Y1 - 2024/7/10

N2 - Large language models (LLMs) have demonstrated remarkable capabilities across various research domains, including the field of Information Retrieval (IR). However, the responses generated by off-the-shelf LLMs tend to be generic, i.e., cannot capture the distinctiveness of each document with similar content. This limits the performance of LLMs in IR because finding and distinguishing relevant documents from substantial similar documents is a typical problem in many IR tasks. To address this issue, we propose an unsupervised alignment method, namely Reinforcement Learning from Contrastive Feedback (RLCF), empowering LLMs to generate both high-quality and context-specific responses. Our approach constructs unsupervised contrastive feedback signals based on similar document groups, and adopts a reward function, named group-wise reciprocal rank, to optimize LLMs. We conduct extensive experiments to evaluate the effectiveness of RLCF.

AB - Large language models (LLMs) have demonstrated remarkable capabilities across various research domains, including the field of Information Retrieval (IR). However, the responses generated by off-the-shelf LLMs tend to be generic, i.e., cannot capture the distinctiveness of each document with similar content. This limits the performance of LLMs in IR because finding and distinguishing relevant documents from substantial similar documents is a typical problem in many IR tasks. To address this issue, we propose an unsupervised alignment method, namely Reinforcement Learning from Contrastive Feedback (RLCF), empowering LLMs to generate both high-quality and context-specific responses. Our approach constructs unsupervised contrastive feedback signals based on similar document groups, and adopts a reward function, named group-wise reciprocal rank, to optimize LLMs. We conduct extensive experiments to evaluate the effectiveness of RLCF.

KW - alignment

KW - information retrieval

KW - large language models

UR - http://www.scopus.com/inward/record.url?scp=85200566667&partnerID=8YFLogxK

U2 - 10.1145/3626772.3657689

DO - 10.1145/3626772.3657689

M3 - Conference contribution

AN - SCOPUS:85200566667

T3 - SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

SP - 48

EP - 58

BT - SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

PB - Association for Computing Machinery, Inc

T2 - 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024

Y2 - 14 July 2024 through 18 July 2024

ER -

Dong Q, Liu Y, Ai Q, Wu Z, Li H, Liu Y et al. Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback. In SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, Inc. 2024. pp. 48-58. (SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval). doi: 10.1145/3626772.3657689
