TY - GEN
T1 - Reader-Guided Passage Reranking for Open-Domain Question Answering
AU - Mao, Yuning
AU - He, Pengcheng
AU - Liu, Xiaodong
AU - Shen, Yelong
AU - Gao, Jianfeng
AU - Han, Jiawei
AU - Chen, Weizhu
N1 - Publisher Copyright:
© 2021 Association for Computational Linguistics
PY - 2021
Y1 - 2021
N2 - Current open-domain question answering systems often follow a Retriever-Reader architecture, where the retriever first retrieves relevant passages and the reader then reads the retrieved passages to form an answer. In this paper, we propose a simple and effective passage reranking method, named Reader-guIDEd Reranker (RIDER), which does not involve training and reranks the retrieved passages solely based on the top predictions of the reader before reranking. We show that RIDER, despite its simplicity, achieves 10 to 20 absolute gains in top-1 retrieval accuracy and 1 to 4 Exact Match (EM) gains without refining the retriever or reader. In addition, RIDER, without any training, outperforms state-of-the-art transformer-based supervised rerankers. Remarkably, RIDER achieves 48.3 EM on the Natural Questions dataset and 66.4 EM on the TriviaQA dataset when only 1,024 tokens (7.8 passages on average) are used as the reader input after passage reranking.
UR - http://www.scopus.com/inward/record.url?scp=85123928739&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85123928739&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85123928739
T3 - Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
SP - 344
EP - 350
BT - Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
A2 - Zong, Chengqing
A2 - Xia, Fei
A2 - Li, Wenjie
A2 - Navigli, Roberto
PB - Association for Computational Linguistics (ACL)
T2 - Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Y2 - 1 August 2021 through 6 August 2021
ER -