Leveraging Document-Level and Query-Level Passage Cumulative Gain for Document Ranking

Zhi Jing Wu, Yi Qun Liu*, Jia Xin Mao, Min Zhang, Shao Ping Ma

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

3 Citations (Scopus)

Abstract

Document ranking is one of the most studied but challenging problems in information retrieval (IR). More and more studies have begun to address this problem through fine-grained document modeling. However, most of them focus on context-independent passage-level relevance signals and ignore the context information. In this paper, we investigate how information gain accumulates with passages and propose the context-aware Passage Cumulative Gain (PCG). The fine-grained PCG avoids the need to split documents into independent passages. We investigate PCG patterns at the document level (DPCG) and the query level (QPCG). Based on the patterns, we propose a BERT-based sequential model called Passage-level Cumulative Gain Model (PCGM) and show that PCGM can effectively predict PCG sequences. Finally, we apply PCGM to the document ranking task using two approaches. The first leverages DPCG sequences to estimate the gain of an individual document. Experimental results on two public ad hoc retrieval datasets show that PCGM outperforms most existing ranking models. The second considers the cross-document effects and leverages QPCG sequences to estimate the marginal relevance. Experimental results show that the predicted results are highly consistent with users’ preferences. We believe that this work contributes to improving ranking performance and providing more explainability for document ranking.

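Original language: English
Pages (from-to): 814-838
Number of pages: 25
Journal: Journal of Computer Science and Technology
Volume: 37
Issue number: 4
DOI: https://doi.org/10.1007/s11390-022-2031-y
Publication status: Published - Jul 2022
Externally published: Yes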

Cite this

Wu, Z. J., Liu, Y. Q., Mao, J. X., Zhang, M., & Ma, S. P. (2022). Leveraging Document-Level and Query-Level Passage Cumulative Gain for Document Ranking. Journal of Computer Science and Technology, 37(4), 814-838. https://doi.org/10.1007/s11390-022-2031-y