
Steering Large Language Models for Cross-lingual Information Retrieval

  • Ping Guo
  • Yubing Ren
  • Yue Hu*
  • Yanan Cao
  • Yunpeng Li
  • Heyan Huang
  • *Corresponding author for this work
  • CAS - Institute of Information Engineering

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

In today's digital age, accessing information across language barriers poses a significant challenge, with conventional search systems often struggling to interpret and retrieve multilingual content accurately. To address this issue, our study uses Large Language Models (LLMs) as Cross-lingual Readers in information retrieval systems, specifically targeting the complexities of cross-lingual information retrieval (CLIR). We present Activation Steered Multilingual Retrieval (ASMR), an approach that employs "steering activations", a method to adjust and direct the LLM's focus, to enhance its ability to understand user queries and generate accurate, language-coherent responses. ASMR combines a Multilingual Dense Passage Retrieval (mDPR) system with an LLM, overcoming the limitations of traditional search engines in handling diverse linguistic inputs. This approach is particularly effective in managing the nuances and intricacies inherent in various languages. Rigorous testing on established benchmarks such as XOR-TyDi QA and MKQA demonstrates that ASMR surpasses existing standards in CLIR, achieving state-of-the-art performance. The results of our research hold significant implications for understanding how LLMs understand and generate natural languages, and are a step towards more inclusive, effective, and linguistically diverse information access on a global scale.
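The abstract names two ingredients: dense passage retrieval and steering of LLM activations. This record does not describe how ASMR implements either, so the following is only a generic sketch of both ideas on toy vectors, not the authors' method. It assumes inner-product scoring for retrieval and a difference-of-means steering direction, which is one common way to build such a vector. All function names, the `alpha` scaling factor, and the example vectors are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product, a common relevance score in dense retrieval."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_vec: Vector, passage_vecs: List[Vector], k: int = 2) -> List[int]:
    """Return indices of the k passages with the highest inner-product score."""
    scores = [dot(query_vec, p) for p in passage_vecs]
    ranked = sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def steering_vector(target_acts: List[Vector], source_acts: List[Vector]) -> Vector:
    """Assumed construction: mean target activation minus mean source activation."""
    target_mean = mean_vector(target_acts)
    source_mean = mean_vector(source_acts)
    return [t - s for t, s in zip(target_mean, source_mean)]


def steer(hidden: Vector, direction: Vector, alpha: float = 1.0) -> Vector:
    """Shift a hidden state along the steering direction, scaled by alpha."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Toy usage: hypothetical 2-dimensional embeddings and activations.
passages = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
top_passages = retrieve([1.0, 0.2], passages, k=2)

direction = steering_vector(
    target_acts=[[2.0, 0.0], [4.0, 2.0]],
    source_acts=[[1.0, 1.0], [1.0, 1.0]],
)
steered_state = steer([0.5, 0.5], direction, alpha=0.5)
```

In a real system the vectors would come from a multilingual encoder and from a chosen layer of the LLM, and the steered state would be written back into the forward pass. Which layer to use and how large to make `alpha` are design choices this sketch leaves open.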

Original language: English
Title of host publication: SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 585-596
Number of pages: 12
ISBN (Electronic): 9798400704314
DOI
Publication status: Published - 11 Jul 2024
Event: 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024 - Washington, United States
Duration: 14 Jul 2024 to 18 Jul 2024

Publication series

Name: SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference: 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024
Country/Territory: United States
City: Washington
Period: 14/07/24 to 18/07/24
