TEXplorer: Keyword-based object search and exploration in multidimensional text databases

Bo Zhao, Xide Lin, Bolin Ding, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a novel system TEXplorer that integrates keyword-based object ranking with the aggregation and exploration power of OLAP in a text database with rich structured attributes available, e.g., a product review database. TEXplorer can be implemented within a multi-dimensional text database, where each row is associated with structural dimensions (attributes) and text data (e.g., a document). The system utilizes the text cube data model, where a cell aggregates a set of documents with matching values in a subset of dimensions. Cells in a text cube capture different levels of summarization of the documents, and can represent objects at different conceptual levels. Users query the system by submitting a set of keywords. Instead of returning a ranked list of all the cells, we propose a keyword-based interactive exploration framework that could offer flexible OLAP navigational guides and help users identify the levels and objects they are interested in. A novel significance measure of dimensions is proposed based on the distribution of IR relevance of cells. During each interaction stage, dimensions are ranked according to their significance scores to guide drilling down; and cells in the same cuboids are ranked according to their relevance to guide exploration. We propose efficient algorithms and materialization strategies for ranking top-k dimensions and cells. Finally, extensive experiments on real datasets demonstrate the efficiency and effectiveness of our approach.
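As a rough illustration of the ideas in the abstract, the following minimal sketch groups documents into cells over a chosen subset of dimensions, ranks cells by keyword relevance, and ranks dimensions to suggest where to drill down. It is not the authors' method: the sample rows, the function names, the token-fraction relevance score, and the use of variance as a proxy for dimension significance are all assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import pvariance

# Hypothetical toy database: structured dimensions plus a text field per row.
ROWS = [
    {"brand": "Acme", "type": "laptop", "text": "long battery life and light"},
    {"brand": "Acme", "type": "phone", "text": "battery drains fast"},
    {"brand": "Bolt", "type": "laptop", "text": "bright screen heavy body"},
    {"brand": "Bolt", "type": "phone", "text": "great screen great camera"},
]


def cells(rows, dims):
    """Group documents into cells: one cell per combination of values on dims."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[tuple(row[d] for d in dims)].append(row["text"])
    return dict(grouped)


def relevance(docs, keywords):
    """Assumed stand-in score: fraction of the cell's tokens that are query keywords."""
    tokens = [t for doc in docs for t in doc.lower().split()]
    return sum(t in keywords for t in tokens) / len(tokens) if tokens else 0.0


def rank_cells(rows, dims, keywords):
    """Rank the cells of one cuboid by descending relevance."""
    scored = {key: relevance(docs, keywords) for key, docs in cells(rows, dims).items()}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


def rank_dimensions(rows, candidate_dims, keywords):
    """Assumed proxy for significance: variance of cell relevance within a dimension."""
    scores = {}
    for d in candidate_dims:
        rel = [r for _, r in rank_cells(rows, [d], keywords)]
        scores[d] = pvariance(rel) if rel else 0.0
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    query = {"battery"}
    print(rank_dimensions(ROWS, ["brand", "type"], query))
    print(rank_cells(ROWS, ["brand"], query))
```

In this toy data the query keyword is concentrated in one brand but spread evenly across product types, so the variance proxy places `brand` ahead of `type` as the dimension to drill into.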

Original language: English (US)
Title of host publication: CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management
Pages: 1709-1718
Number of pages: 10
DOIs: https://doi.org/10.1145/2063576.2063822
State: Published - Dec 13 2011
Event: 20th ACM Conference on Information and Knowledge Management, CIKM'11 - Glasgow, United Kingdom
Duration: Oct 24 2011 to Oct 28 2011

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: 20th ACM Conference on Information and Knowledge Management, CIKM'11
Country: United Kingdom
City: Glasgow
Period: 10/24/11 to 10/28/11

Keywords

  • faceted search
  • keyword search
  • object exploration
  • object search

ASJC Scopus subject areas

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

  • Cite this

    Zhao, B., Lin, X., Ding, B., & Han, J. (2011). TEXplorer: Keyword-based object search and exploration in multidimensional text databases. In CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management (pp. 1709-1718). (International Conference on Information and Knowledge Management, Proceedings). https://doi.org/10.1145/2063576.2063822