TY - GEN
T1 - Dirichlet aspect weighting
T2 - 6th International Conference on Data Mining, ICDM 2006
AU - Velivelli, Atulya
AU - Huang, Thomas S.
PY - 2006
Y1 - 2006
N2 - In this paper we address the problem of document retrieval with semantically structured queries, that is, queries in which each term has a tagged field label. We introduce the Dirichlet Aspect Weighting model, which integrates terms from external databases into the query language model in a Bayesian learning framework. In this model, the Dirichlet prior distribution is governed by parameters that depend on the number of fields in the external databases. The model requires the semantically structured query to be augmented with additional examples, which are obtained using pseudo relevance feedback. We formulate a log-likelihood function for the Dirichlet Aspect Weighting model and maximize it using a novel Generalized EM algorithm. Compared with baseline methods that use pseudo relevance feedback while incorporating terms from external databases, the Dirichlet Aspect Weighting model shows an improvement on the TREC 2005 Genomics Track dataset.
AB - In this paper we address the problem of document retrieval with semantically structured queries, that is, queries in which each term has a tagged field label. We introduce the Dirichlet Aspect Weighting model, which integrates terms from external databases into the query language model in a Bayesian learning framework. In this model, the Dirichlet prior distribution is governed by parameters that depend on the number of fields in the external databases. The model requires the semantically structured query to be augmented with additional examples, which are obtained using pseudo relevance feedback. We formulate a log-likelihood function for the Dirichlet Aspect Weighting model and maximize it using a novel Generalized EM algorithm. Compared with baseline methods that use pseudo relevance feedback while incorporating terms from external databases, the Dirichlet Aspect Weighting model shows an improvement on the TREC 2005 Genomics Track dataset.
UR - http://www.scopus.com/inward/record.url?scp=84878088384&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84878088384&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2006.55
DO - 10.1109/ICDM.2006.55
M3 - Conference contribution
AN - SCOPUS:84878088384
SN - 0769527019
SN - 9780769527017
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 633
EP - 644
BT - Proceedings - Sixth International Conference on Data Mining, ICDM 2006
Y2 - 18 December 2006 through 22 December 2006
ER -