A study of term proximity and document weighting normalization in Pseudo Relevance Feedback - UIUC at TREC 2009 Million Query Track

Yuanhua Lv, Jing He, V. G. Vinod Vydiswaran, Kavita Ganesan, Cheng Xiang Zhai

Research output: Research - peer-review, Article

Abstract

In this paper, we report our experiments in the TREC 2009 Million Query Track. Our first line of study is on proximity-based feedback, in which we propose a positional relevance model (PRM) that exploits term proximity evidence to assign more weight to expansion words that are closer to query words in feedback documents. The second line of study aims to improve the weighting of feedback documents in the relevance model by using a regression-based method to approximate the probability of relevance (hence the name RegRM). In the third line of study, we test a supervised approach for query classification. We also evaluate a selective pseudo feedback strategy that disables pseudo feedback for precision-oriented queries and uses it only for recall-oriented ones. The proposed PRM shows clear improvements over the relevance model for pseudo feedback, suggesting that capturing the term proximity heuristic appropriately could lead to a better feedback model. RegRM performs as well as the relevance model, but no noticeable improvement is observed. Unfortunately, the proposed query classification methods do not appear to work well. The results also show that the proposed selective pseudo feedback may not work well, since precision-oriented queries can also benefit from pseudo feedback, though not as much as recall-oriented queries.

Language: English (US)
Journal: NIST Special Publication
State: Published - 2009

ASJC Scopus subject areas

  • Engineering(all)

Cite this

A study of term proximity and document weighting normalization in Pseudo Relevance Feedback - UIUC at TREC 2009 Million Query Track. / Lv, Yuanhua; He, Jing; Vinod Vydiswaran, V. G.; Ganesan, Kavita; Zhai, Cheng Xiang.

In: NIST Special Publication, 2009.

@article{5a87e19d6d2d4b0ba2ae2ef5dbe1af6e,
title = "A study of term proximity and document weighting normalization in Pseudo Relevance Feedback - UIUC at TREC 2009 Million Query Track",
abstract = "In this paper, we report our experiments in the TREC 2009 Million Query Track. Our first line of study is on proximity-based feedback, in which we propose a positional relevance model (PRM) to exploit term proximity evidence so as to assign more weights to expansion words that are closer to query words in feedback documents. The second line of study is to improve the weighting of feedback documents in the relevance model by using a regression-based method to approximate the probability of relevance (and thus the name RegRM). In the third line of study, we test a supervised approach for query classification. Besides, we also evaluate a selective pseudo feedback strategy which stops pseudo feedback for precision-oriented queries and only uses it for recall-oriented ones. The proposed PRM has shown clear improvements over the relevance model for pseudo feedback, suggesting that capturing the term proximity heuristic appropriately could lead to a better feedback model. RegRM performs as well as relevance model, but no noticeable improvement is observed. Unfortunately, the proposed query classification methods appear to not work well. The results also show that the proposed selective pseudo feedback may not work well, since precision-oriented queries can also benefit from pseudo feedback, though not as much as recall-oriented queries.",
author = "Yuanhua Lv and Jing He and {Vinod Vydiswaran}, {V. G.} and Kavita Ganesan and Zhai, {Cheng Xiang}",
year = "2009",
journal = "NIST Special Publication",
issn = "1048-776X",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY - JOUR

T1 - A study of term proximity and document weighting normalization in Pseudo Relevance Feedback - UIUC at TREC 2009 Million Query Track

AU - Lv, Yuanhua

AU - He, Jing

AU - Vinod Vydiswaran, V. G.

AU - Ganesan, Kavita

AU - Zhai, Cheng Xiang

PY - 2009

Y1 - 2009

AB - In this paper, we report our experiments in the TREC 2009 Million Query Track. Our first line of study is on proximity-based feedback, in which we propose a positional relevance model (PRM) to exploit term proximity evidence so as to assign more weights to expansion words that are closer to query words in feedback documents. The second line of study is to improve the weighting of feedback documents in the relevance model by using a regression-based method to approximate the probability of relevance (and thus the name RegRM). In the third line of study, we test a supervised approach for query classification. Besides, we also evaluate a selective pseudo feedback strategy which stops pseudo feedback for precision-oriented queries and only uses it for recall-oriented ones. The proposed PRM has shown clear improvements over the relevance model for pseudo feedback, suggesting that capturing the term proximity heuristic appropriately could lead to a better feedback model. RegRM performs as well as relevance model, but no noticeable improvement is observed. Unfortunately, the proposed query classification methods appear to not work well. The results also show that the proposed selective pseudo feedback may not work well, since precision-oriented queries can also benefit from pseudo feedback, though not as much as recall-oriented queries.

UR - http://www.scopus.com/inward/record.url?scp=84873437893&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84873437893&partnerID=8YFLogxK

M3 - Article

JO - NIST Special Publication

T2 - NIST Special Publication

JF - NIST Special Publication

SN - 1048-776X

ER -