Latent semantic analysis for multiple-type interrelated data objects

Xuanhui Wang, Jian Tao Sun, Zheng Chen, Cheng Xiang Zhai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Co-occurrence data is common in many real applications. Latent Semantic Analysis (LSA) has been successfully used to identify semantic relations in such data. However, LSA can only handle a single co-occurrence relationship between two types of objects. In practical applications, there are many cases where multiple types of objects exist and any pair of these types could have a co-occurrence relation. All these co-occurrence relations can be exploited to alleviate data sparseness or to represent objects more meaningfully. In this paper, we propose a novel algorithm, M-LSA, which conducts latent semantic analysis by incorporating all pairwise co-occurrences among multiple types of objects. Based on the mutual reinforcement principle, M-LSA identifies the most salient concepts among the co-occurrence data and represents all the objects in a unified semantic space. M-LSA is general and we show that several variants of LSA are special cases of our algorithm. Experimental results show that M-LSA outperforms LSA on multiple applications, including collaborative filtering, text clustering, and text categorization.
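
For readers unfamiliar with the baseline, the following is a minimal sketch of the standard two-type case only (for example, documents and terms), not the M-LSA algorithm itself, whose details are not given in this abstract. It assumes a small, nonzero co-occurrence matrix and uses power iteration, in which row scores and column scores reinforce each other until they settle on the leading singular vectors, i.e. the most salient LSA concept. The function name `leading_concept` and the toy matrix are hypothetical and chosen purely for illustration.

```python
import math


def leading_concept(matrix, iterations=100):
    """Estimate the leading singular triple of a co-occurrence matrix.

    Illustrative assumption: `matrix` is a non-empty list of equal-length
    rows with at least one nonzero entry. This is plain power iteration,
    not the M-LSA procedure from the paper.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Start from a uniform unit vector over the column objects.
    col_scores = [1.0 / math.sqrt(n_cols)] * n_cols
    row_scores = [0.0] * n_rows
    sigma = 0.0
    for _ in range(iterations):
        # Row objects are scored by the column objects they co-occur with.
        row_scores = [
            sum(matrix[i][j] * col_scores[j] for j in range(n_cols))
            for i in range(n_rows)
        ]
        norm = math.sqrt(sum(x * x for x in row_scores))
        row_scores = [x / norm for x in row_scores]
        # Column objects are scored by the row objects they co-occur with.
        col_scores = [
            sum(matrix[i][j] * row_scores[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # At convergence this norm is the leading singular value.
        sigma = math.sqrt(sum(x * x for x in col_scores))
        col_scores = [x / sigma for x in col_scores]
    return row_scores, sigma, col_scores


# Toy document-by-term counts: documents 0 and 1 share terms 0 and 1,
# while document 2 uses only terms 2 and 3.
counts = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 1],
]
doc_scores, strength, term_scores = leading_concept(counts)
```

In this toy input the dominant concept is the one shared by the first two documents, so the third document's score on it decays toward zero. Handling more than two object types at once, as the abstract describes, would require combining several such matrices and is beyond this sketch.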

Original language: English (US)
Title of host publication: Proceedings of the Twenty-Ninth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery
Pages: 236-243
Number of pages: 8
ISBN (Print): 1595933697, 9781595933690
DOIs
State: Published - 2006
Event: 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - Seattle, WA, United States
Duration: Aug 6 2006 to Aug 11 2006

Publication series

Name: Proceedings of the Twenty-Ninth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Volume: 2006

Other

Other: 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Country: United States
City: Seattle, WA
Period: 8/6/06 to 8/11/06

Keywords

  • LSA
  • M-LSA
  • Multiple-type
  • Mutual reinforcement principle

ASJC Scopus subject areas

  • Engineering(all)
  • Information Systems
  • Software
  • Applied Mathematics

