TY - GEN
T1 - Unifying learning to rank and domain adaptation
T2 - 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014
AU - Zhou, Mianwei
AU - Chang, Kevin C.
PY - 2014
Y1 - 2014
AB - For document scoring, although learning to rank and domain adaptation have been treated as two different problems in previous work, we discover that they actually share the same challenge of adapting keyword contribution across different queries or domains. In this paper, we propose to study the cross-task document scoring problem, where a task refers to a query to rank or a domain to adapt to, as the first attempt to unify these two problems. Existing solutions for learning to rank and domain adaptation either leave the heavy burden of adapting keyword contribution to feature designers, or are difficult to generalize. To resolve these limitations, we abstract the keyword scoring principle, pointing out that the contribution of a keyword essentially depends on, first, its importance to a task and, second, its importance to the document. To determine these two aspects of keyword importance, we further propose the concept of feature decoupling, which suggests using two types of easy-to-design features: meta-features and intra-features. To learn a scorer based on the decoupled features, we require that our framework fulfill inferred sparsity to eliminate the interference of noisy keywords, and employ distant supervision to tackle the lack of keyword labels. We propose the Tree-structured Restricted Boltzmann Machine (T-RBM), a novel two-stage Markov Network, as our solution. Experiments on three different applications confirm the effectiveness of T-RBM, which achieves significant improvements over four state-of-the-art baseline methods.
KW - domain adaptation
KW - feature decoupling
KW - learning to rank
KW - tree-structured restricted Boltzmann machine
UR - http://www.scopus.com/inward/record.url?scp=84907021702&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84907021702&partnerID=8YFLogxK
U2 - 10.1145/2623330.2623739
DO - 10.1145/2623330.2623739
M3 - Conference contribution
AN - SCOPUS:84907021702
SN - 9781450329569
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 781
EP - 790
BT - KDD 2014 - Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 24 August 2014 through 27 August 2014
ER -