Ranking Like Human: Global-View Matching via Reinforcement Learning for Answer Selection

Yingxue Zhang, Ping Jian, Ruiying Geng, Yuansheng Song, Fandong Meng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Answer Selection (AS) is of great importance for open-domain Question Answering (QA). Previous approaches typically model each question-candidate pair independently. However, when selecting correct answers from the candidate set, the question alone is usually too brief to provide enough matching information for the right decision. In this paper, we propose a reinforcement learning framework that exploits the rich overlapping information among answer candidates to help judge the correctness of each candidate. In particular, we design a policy network whose state aggregates both the question-candidate matching information and the candidate-candidate matching information through a global-view encoder. Experiments on the WikiQA and SelQA benchmarks demonstrate that our RL framework substantially improves ranking performance.
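The abstract only outlines the architecture, so a minimal sketch may help make the idea concrete. The code below is an illustrative assumption of how such a global-view policy network could be structured, not the authors' implementation: the shared BiLSTM encoder, the mean-pooled global view, and all dimensions are hypothetical choices.

```python
# A minimal sketch (not the paper's code) of a policy network whose state
# aggregates question-candidate matching and candidate-candidate matching
# through a simple "global view" over the remaining candidates.
import torch
import torch.nn as nn


class GlobalViewPolicyNetwork(nn.Module):
    def __init__(self, emb_dim: int = 128, hidden: int = 128):
        super().__init__()
        # Shared BiLSTM encoder for the question and all candidate answers
        # (an assumed choice; the paper's encoder may differ).
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True,
                               bidirectional=True)
        # Policy head: maps the aggregated state to P(select candidate).
        self.policy = nn.Sequential(
            nn.Linear(6 * hidden, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # Mean-pool BiLSTM outputs into a fixed-size sentence vector.
        out, _ = self.encoder(x)
        return out.mean(dim=1)

    def forward(self, question, candidate, other_candidates):
        # question:         (batch, q_len, emb_dim) word embeddings
        # candidate:        (batch, a_len, emb_dim)
        # other_candidates: list of (batch, a_len, emb_dim) tensors
        q = self.encode(question)   # question representation
        a = self.encode(candidate)  # current-candidate representation
        # Global view: aggregate the other candidates so overlapping
        # evidence among answers can inform the decision (here, a mean).
        g = torch.stack([self.encode(c) for c in other_candidates]).mean(0)
        state = torch.cat([q, a, g], dim=-1)  # aggregated policy state
        return torch.sigmoid(self.policy(state))  # selection probability
```

During training, the selection probability can serve as the policy for sampling select/skip actions, with a ranking-based reward driving REINFORCE-style updates; this is one plausible instantiation of the framework described above, not a reproduction of it.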

Original language: English
Title of host publication: Proceedings of the 2019 International Conference on Asian Language Processing, IALP 2019
Editors: Man Lan, Yuanbin Wu, Minghui Dong, Yanfeng Lu, Yan Yang
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 456-461
Number of pages: 6
ISBN (Electronic): 9781728150147
DOIs
Publication status: Published - Nov 2019
Event: 23rd International Conference on Asian Language Processing, IALP 2019 - Shanghai, China
Duration: 15 Nov 2019 - 17 Nov 2019

Publication series

Name: Proceedings of the 2019 International Conference on Asian Language Processing, IALP 2019

Conference

Conference: 23rd International Conference on Asian Language Processing, IALP 2019
Country/Territory: China
City: Shanghai
Period: 15/11/19 - 17/11/19

Keywords

  • Answer Selection
  • Reinforcement Learning
