BIT and MSRA at TREC KBA CCR Track 2013

Jingang Wang; Dandan Song; Chin Yew Lin; Lejian Liao

Jingang Wang, Dandan Song^*, Chin Yew Lin, Lejian Liao

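^*Corresponding author for this work

School of Computer Science

Research output: Contribution to conference › Paper › Peer-reviewed

14 citations (Scopus)

Abstract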

Our strategy for the TREC KBA CCR track is to first retrieve as many vital or useful documents as possible and then apply more sophisticated classification and ranking methods to differentiate vital from useful documents. We submitted 10 runs generated by three approaches: query expansion, classification, and learning to rank. Query expansion is an unsupervised baseline in which we combine entities' names and their related entities' names as phrase queries to retrieve relevant documents. This baseline outperforms the overall median and mean submissions. System performance is further improved by supervised classification and learning-to-rank methods. We mainly exploit three kinds of external resources to construct the features for supervised learning: (i) entry pages of Wikipedia entities or profile pages of Twitter entities, (ii) existing citations in the Wikipedia page of an entity, and (iii) bursts of Wikipedia page views of an entity. In the vital + useful task, one of our ranking-based methods achieves the best result among all participants. In the vital-only task, one of our classification-based methods achieves the overall best result.
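
The query-expansion baseline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Entity` class, the function names, the normalization rules, and the sample names are all assumptions made for illustration. It only shows the general idea of combining an entity's name with its related entities' names and using them as phrase queries to filter documents.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A target entity. All field names here are hypothetical."""
    name: str
    related_names: list[str] = field(default_factory=list)


def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace to one space."""
    return " ".join(text.lower().split())


def build_phrase_queries(entity: Entity) -> list[str]:
    """Combine the entity's name with its related entities' names.

    Phrases are normalized and de-duplicated, keeping first-seen order.
    """
    seen: set[str] = set()
    queries: list[str] = []
    for phrase in [entity.name, *entity.related_names]:
        normalized = normalize(phrase)
        if normalized and normalized not in seen:
            seen.add(normalized)
            queries.append(normalized)
    return queries


def matches_any_phrase(document_text: str, queries: list[str]) -> bool:
    """Return True if any query occurs in the document as a whole phrase.

    The lookarounds stop a phrase from matching inside a longer word.
    """
    doc = normalize(document_text)
    for query in queries:
        pattern = r"(?<!\w)" + re.escape(query) + r"(?!\w)"
        if re.search(pattern, doc):
            return True
    return False


# Hypothetical entity and documents, used only to exercise the sketch.
entity = Entity("Jane Doe", ["Acme Research Lab", "jane  doe"])
queries = build_phrase_queries(entity)
candidate_docs = [
    "The ACME research\nlab opened a new office.",
    "Janet Doerr spoke at the event.",
]
retrieved = [d for d in candidate_docs if matches_any_phrase(d, queries)]
```

A filter like this favors recall: any document mentioning the entity or one of its related entities is kept, and separating vital from useful documents is left to the later supervised stages.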

Original language	English
Publication status	Published - 2013
Event	22nd Text REtrieval Conference, TREC 2013 - Gaithersburg, United States. Duration: 19 Nov 2013 → 22 Nov 2013

Conference

Conference	22nd Text REtrieval Conference, TREC 2013
Country/Territory	United States
City	Gaithersburg
Period	19/11/13 → 22/11/13

Cite this

Wang, J., Song, D., Lin, C. Y., & Liao, L. (2013). BIT and MSRA at TREC KBA CCR Track 2013. Paper presented at the 22nd Text REtrieval Conference, TREC 2013, Gaithersburg, United States.

@conference{1ae3da598f7548349bf527a280ac7f6a,

title = "BIT and MSRA at TREC KBA CCR Track 2013",

abstract = "Our strategy for the TREC KBA CCR track is to first retrieve as many vital or useful documents as possible and then apply more sophisticated classification and ranking methods to differentiate vital from useful documents. We submitted 10 runs generated by three approaches: query expansion, classification, and learning to rank. Query expansion is an unsupervised baseline in which we combine entities' names and their related entities' names as phrase queries to retrieve relevant documents. This baseline outperforms the overall median and mean submissions. System performance is further improved by supervised classification and learning-to-rank methods. We mainly exploit three kinds of external resources to construct the features for supervised learning: (i) entry pages of Wikipedia entities or profile pages of Twitter entities, (ii) existing citations in the Wikipedia page of an entity, and (iii) bursts of Wikipedia page views of an entity. In the vital + useful task, one of our ranking-based methods achieves the best result among all participants. In the vital-only task, one of our classification-based methods achieves the overall best result.",

author = "Jingang Wang and Dandan Song and Lin, {Chin Yew} and Lejian Liao",

note = "Publisher Copyright: {\textcopyright} 2013 22nd Text REtrieval Conference, TREC 2013 - Proceedings. All Rights Reserved.; 22nd Text REtrieval Conference, TREC 2013 ; Conference date: 19-11-2013 Through 22-11-2013",

year = "2013",

language = "English",

}

TY - CONF

T1 - BIT and MSRA at TREC KBA CCR Track 2013

AU - Wang, Jingang

AU - Song, Dandan

AU - Lin, Chin Yew

AU - Liao, Lejian

PY - 2013

Y1 - 2013

N2 - Our strategy for the TREC KBA CCR track is to first retrieve as many vital or useful documents as possible and then apply more sophisticated classification and ranking methods to differentiate vital from useful documents. We submitted 10 runs generated by three approaches: query expansion, classification, and learning to rank. Query expansion is an unsupervised baseline in which we combine entities' names and their related entities' names as phrase queries to retrieve relevant documents. This baseline outperforms the overall median and mean submissions. System performance is further improved by supervised classification and learning-to-rank methods. We mainly exploit three kinds of external resources to construct the features for supervised learning: (i) entry pages of Wikipedia entities or profile pages of Twitter entities, (ii) existing citations in the Wikipedia page of an entity, and (iii) bursts of Wikipedia page views of an entity. In the vital + useful task, one of our ranking-based methods achieves the best result among all participants. In the vital-only task, one of our classification-based methods achieves the overall best result.

UR - http://www.scopus.com/inward/record.url?scp=85018093307&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:85018093307

T2 - 22nd Text REtrieval Conference, TREC 2013

Y2 - 19 November 2013 through 22 November 2013

ER -
