A study of per-topic variance on system comparison

Meng Yang, Peng Zhang*, Dawei Song

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

Under the notion that the document collection is a sample from a population, the observed per-topic metric value (e.g., AP) varies across samples, giving rise to per-topic variance. The results of system comparison, such as ranking systems by a summary metric (e.g., MAP) or testing whether there is a significant difference between two systems, are affected by this variability in per-topic metric values. In this paper, we study the effect of per-topic variance on system comparison. To measure this effect, we employ two ranking-based methods, Error Rate (ER) and Kendall Rank Correlation Coefficient (KRCC), and two significance-test-based methods, Achieved Significance Level (ASL) and Estimated Difference (ED). We conduct an empirical comparison of systems that participated in the TREC Robust and Adhoc tracks, which shows that the effect of per-topic variance on the ranking of systems is not obvious, while the significance-test-based comparisons are susceptible to per-topic variance.
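As a rough illustration of the ranking-based comparison mentioned in the abstract, the sketch below computes Kendall's rank correlation between a MAP-based system ranking and the ranking obtained after the per-topic AP values are perturbed. It is not the authors' procedure: the AP matrix is made-up data, the Gaussian noise in `perturb` is only an assumed stand-in for per-topic variance, and the names `mean_ap`, `kendall_tau`, and `perturb` are hypothetical.

```python
import random
from itertools import combinations


def mean_ap(ap_by_topic):
    """Summary metric: mean of the per-topic AP values of one system."""
    return sum(ap_by_topic) / len(ap_by_topic)


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score lists over the same systems.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(scores_a)), 2):
        sign = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(scores_a) * (len(scores_a) - 1) // 2
    return (concordant - discordant) / n_pairs


def perturb(ap_matrix, sd, rng):
    """Assumption: model per-topic variance as Gaussian noise, clipped to [0, 1]."""
    return [
        [min(1.0, max(0.0, ap + rng.gauss(0.0, sd))) for ap in row]
        for row in ap_matrix
    ]


# Made-up AP values: 4 systems (rows) x 5 topics (columns).
ap = [
    [0.30, 0.45, 0.10, 0.60, 0.25],
    [0.32, 0.40, 0.15, 0.55, 0.30],
    [0.20, 0.35, 0.05, 0.50, 0.20],
    [0.28, 0.50, 0.12, 0.58, 0.22],
]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
observed_map = [mean_ap(row) for row in ap]
perturbed_map = [mean_ap(row) for row in perturb(ap, sd=0.05, rng=rng)]
tau = kendall_tau(observed_map, perturbed_map)
print(f"Kendall's tau between observed and perturbed rankings: {tau:.3f}")
```

A tau close to 1 would mean the system ranking is largely unchanged by the perturbation; repeating the perturbation many times would give a distribution of tau values rather than a single number.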

Original language: English
Title of host publication: 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018
Publisher: Association for Computing Machinery, Inc
Pages: 1181-1184
Number of pages: 4
ISBN (Electronic): 9781450356572
Publication status: Published - 27 Jun 2018
Externally published: Yes
Event: 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018 - Ann Arbor, United States
Duration: 8 Jul 2018 to 12 Jul 2018

Publication series

Name: 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018

Conference

Conference: 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018
Country/Territory: United States
City: Ann Arbor
Period: 8/07/18 to 12/07/18

Keywords

  • Evaluation
  • Per-topic variance
  • System comparison
