Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback

Qian Dong, Yiding Liu, Qingyao Ai*, Zhijing Wu, Haitao Li, Yiqun Liu, Shuaiqiang Wang, Dawei Yin, Shaoping Ma

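*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract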

Large language models (LLMs) have demonstrated remarkable capabilities across various research domains, including the field of Information Retrieval (IR). However, the responses generated by off-the-shelf LLMs tend to be generic, i.e., they fail to capture what distinguishes each document from others with similar content. This limits the performance of LLMs in IR because finding and distinguishing relevant documents among a large number of similar documents is a typical problem in many IR tasks. To address this issue, we propose an unsupervised alignment method, namely Reinforcement Learning from Contrastive Feedback (RLCF), empowering LLMs to generate both high-quality and context-specific responses. Our approach constructs unsupervised contrastive feedback signals based on similar document groups, and adopts a reward function, named group-wise reciprocal rank, to optimize LLMs. We conduct extensive experiments to evaluate the effectiveness of RLCF.
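
The abstract names the reward but does not define it. As a rough illustration only, the sketch below shows one plausible reading: each response is scored against every document in its group of similar documents, and the reward is the reciprocal of the rank at which the response's own source document appears. Everything here is an assumption for illustration, not the authors' implementation: the function names (`overlap_score`, `reciprocal_rank`, `mean_group_reciprocal_rank`), the token-overlap scorer standing in for a real retriever, the pessimistic tie handling, and the averaging over the group.

```python
from collections import Counter
from typing import Callable, List

Scorer = Callable[[str, str], float]


def overlap_score(response: str, document: str) -> float:
    """Toy relevance score (assumption): number of shared lowercase tokens.

    A real setup would presumably use a trained retriever instead.
    """
    r = Counter(response.lower().split())
    d = Counter(document.lower().split())
    return float(sum((r & d).values()))


def reciprocal_rank(response: str, source_index: int, group: List[str],
                    score: Scorer = overlap_score) -> float:
    """Reciprocal rank of the source document when the group is ranked by the response."""
    scores = [score(response, doc) for doc in group]
    target = scores[source_index]
    # Pessimistic rank (assumption): ties count against the source document,
    # so a response that matches several documents equally well is penalised.
    better_or_equal = sum(
        1 for i, s in enumerate(scores) if i != source_index and s >= target
    )
    return 1.0 / (better_or_equal + 1)


def mean_group_reciprocal_rank(responses: List[str], group: List[str],
                               score: Scorer = overlap_score) -> float:
    """Average reward over a group, where responses[i] was generated from group[i]."""
    if len(responses) != len(group):
        raise ValueError("expected one response per document in the group")
    rewards = [reciprocal_rank(r, i, group, score) for i, r in enumerate(responses)]
    return sum(rewards) / len(rewards)


if __name__ == "__main__":
    group = [
        "the cat sat on the mat",
        "the cat sat on the sofa",
        "the dog sat on the mat",
    ]
    # A generic response matches two documents equally well and is ranked second.
    print(reciprocal_rank("the cat sat", 1, group))  # 0.5
    # A response that names the distinguishing detail is ranked first.
    print(reciprocal_rank("sofa", 1, group))  # 1.0
```

Under these assumptions, a generic response that fits several similar documents earns a lower reward than one that picks out the detail unique to its source document, which is the contrast the abstract describes.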

Original language: English
Title of host publication: SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 48-58
Number of pages: 11
ISBN (Electronic): 9798400704314
DOI: https://doi.org/10.1145/3626772.3657689
Publication status: Published - 10 Jul 2024
Event: 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024 - Washington, United States
Duration: 14 Jul 2024 to 18 Jul 2024

Publication series

Name: SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference: 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024
Country/Territory: United States
City: Washington
Period: 14/07/24 to 18/07/24

Cite this

Dong, Q., Liu, Y., Ai, Q., Wu, Z., Li, H., Liu, Y., Wang, S., Yin, D., & Ma, S. (2024). Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback. In SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 48-58). (SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval). Association for Computing Machinery, Inc. https://doi.org/10.1145/3626772.3657689