Heterogeneous graph-based intent learning with queries, web pages and Wikipedia concepts

Xiang Ren, Yujing Wang, Xiao Yu, Jun Yan, Zheng Chen, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The problem of learning user search intents has attracted intensive attention from both industry and academia. However, state-of-the-art intent learning algorithms suffer from different drawbacks when they use only a single type of data source. For example, query text alone has difficulty distinguishing ambiguous queries, and search logs are biased by the order of search results and by users' noisy click behavior. In this work, we leverage, for the first time, three types of objects (queries, web pages, and Wikipedia concepts) collaboratively to learn generic search intents, and we construct a heterogeneous graph to represent multiple types of relationships among them. A novel unsupervised method called heterogeneous graph-based soft-clustering is developed to derive an intent indicator for each object based on the constructed heterogeneous graph. With the proposed co-clustering method, one can enhance the quality of intent understanding by taking advantage of different types of data, which complement each other, and make the implicit intents easier to interpret with explicit knowledge from Wikipedia concepts. Experiments on two real-world datasets demonstrate the power of the proposed method: it achieves a 9.25% improvement in NDCG on a search ranking task and a 4.67% improvement in Rand index on an object co-clustering task compared to the best state-of-the-art method.

Original language: English (US)
Title of host publication: WSDM 2014 - Proceedings of the 7th ACM International Conference on Web Search and Data Mining
Publisher: Association for Computing Machinery
Pages: 23-32
Number of pages: 10
ISBN (Print): 9781450323512
State: Published - 2014
Event: 7th ACM International Conference on Web Search and Data Mining, WSDM 2014 - New York, NY, United States
Duration: Feb 24 2014 → Feb 28 2014

Keywords

  • heterogeneous graph clustering
  • search intent
  • Wikipedia

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
