Regularized locality preserving indexing via spectral regression

Deng Cai, Xiaofei He, Wei Vivian Zhang, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Unlike Latent Semantic Indexing (LSI), which is optimal in the sense of global Euclidean structure, LPI is optimal in the sense of local manifold structure. However, LPI is inefficient in both time and memory, which makes it difficult to apply to very large data sets. Specifically, the computation of LPI involves eigen-decompositions of two dense matrices, which is expensive. In this paper, we propose a new algorithm called Regularized Locality Preserving Indexing (RLPI). Benefiting from recent progress in spectral graph analysis, we cast the original LPI algorithm into a regression framework, which enables us to avoid eigen-decomposition of dense matrices. Moreover, with the regression-based framework, different kinds of regularizers can be naturally incorporated into our algorithm, which makes it more flexible. Extensive experimental results show that RLPI obtains similar or better results compared to LPI and is significantly faster, which makes it an efficient and effective data preprocessing method for large-scale text clustering, classification, and retrieval.
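
The two-stage idea in the abstract (a graph-based spectral step followed by a regularized regression) can be illustrated with a minimal sketch. The abstract gives no algorithmic details, so everything below is an assumption: the function name `rlpi_sketch`, the cosine k-nearest-neighbour graph, the ridge (L2) regularizer, and the parameters `n_neighbors` and `alpha` are illustrative choices, not the authors' specification. For brevity the sketch uses a dense eigensolver on a small graph and assumes the graph is connected; a large-scale version would presumably rely on a sparse graph and an iterative solver.

```python
import numpy as np

def rlpi_sketch(X, n_components=2, n_neighbors=5, alpha=0.1):
    """Hypothetical sketch: graph embedding followed by ridge regression.

    X is an (n_docs, n_terms) document-term matrix.
    Returns a projection matrix A of shape (n_terms, n_components).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Cosine similarity between documents (assumed affinity measure).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)
    S = Xn @ Xn.T
    np.fill_diagonal(S, 0.0)

    # Keep each document's k most similar neighbours, then symmetrise.
    W = np.zeros_like(S)
    idx = np.argsort(-S, axis=1)[:, :n_neighbors]
    rows = np.arange(n)[:, None]
    W[rows, idx] = S[rows, idx]
    W = np.maximum(W, W.T)

    # Spectral step: solve W y = lambda * D y through the symmetric
    # matrix D^{-1/2} W D^{-1/2}. Dense eigh is used only for illustration.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    M = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, evecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    # Skip the trivial top eigenvector (eigenvalue 1 on a connected graph).
    top = evecs[:, ::-1][:, 1:n_components + 1]
    Y = d_inv_sqrt[:, None] * top

    # Regression step: ridge regression of the embedding Y on X,
    # i.e. solve (X^T X + alpha * I) A = X^T Y.
    G = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(G, X.T @ Y)

# Usage on synthetic data (random values, not a real corpus).
rng = np.random.default_rng(0)
docs = rng.random((20, 8))
A = rlpi_sketch(docs, n_components=2, n_neighbors=4)
embedded = docs @ A  # low-dimensional document representation
```

In this sketch the regularizer enters only through the `alpha * np.eye(...)` term, which is where a different penalty could be substituted.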

Original language: English (US)
Title of host publication: CIKM 2007 - Proceedings of the 16th ACM Conference on Information and Knowledge Management
Pages: 741-750
Number of pages: 10
State: Published - 2007
Event: 16th ACM Conference on Information and Knowledge Management, CIKM 2007 - Lisboa, Portugal
Duration: Nov 6 2007 → Nov 9 2007

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: 16th ACM Conference on Information and Knowledge Management, CIKM 2007
Country/Territory: Portugal
City: Lisboa
Period: 11/6/07 → 11/9/07

Keywords

  • Dimensionality reduction
  • Document representation and indexing
  • Regularized locality preserving indexing

ASJC Scopus subject areas

  • General Decision Sciences
  • General Business, Management and Accounting
