TY - GEN
T1 - Regularized locality preserving indexing via spectral regression
AU - Cai, Deng
AU - He, Xiaofei
AU - Zhang, Wei Vivian
AU - Han, Jiawei
PY - 2007
Y1 - 2007
N2 - We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Different from Latent Semantic Indexing (LSI) which is optimal in the sense of global Euclidean structure, LPI is optimal in the sense of local manifold structure. However, LPI is not efficient in time and memory which makes it difficult to be applied to very large data set. Specifically, the computation of LPI involves eigen-decompositions of two dense matrices which is expensive. In this paper, we propose a new algorithm called Regularized Locality Preserving Indexing (RLPI). Benefit from recent progresses on spectral graph analysis, we cast the original LPI algorithm into a regression framework which enable us to avoid eigen-decomposition of dense matrices. Also, with the regression based framework, different kinds of regularizers can be naturally incorporated into our algorithm which makes it more flexible. Extensive experimental results show that RLPI obtains similar or better results comparing to LPI and it is significantly faster, which makes it an efficient and effective data preprocessing method for large scale text clustering, classification and retrieval.
AB - We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Different from Latent Semantic Indexing (LSI) which is optimal in the sense of global Euclidean structure, LPI is optimal in the sense of local manifold structure. However, LPI is not efficient in time and memory which makes it difficult to be applied to very large data set. Specifically, the computation of LPI involves eigen-decompositions of two dense matrices which is expensive. In this paper, we propose a new algorithm called Regularized Locality Preserving Indexing (RLPI). Benefit from recent progresses on spectral graph analysis, we cast the original LPI algorithm into a regression framework which enable us to avoid eigen-decomposition of dense matrices. Also, with the regression based framework, different kinds of regularizers can be naturally incorporated into our algorithm which makes it more flexible. Extensive experimental results show that RLPI obtains similar or better results comparing to LPI and it is significantly faster, which makes it an efficient and effective data preprocessing method for large scale text clustering, classification and retrieval.
KW - Dimensionality reduction
KW - Document representation and indexing
KW - Regularized locality preserving indexing
UR - http://www.scopus.com/inward/record.url?scp=63449101804&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=63449101804&partnerID=8YFLogxK
U2 - 10.1145/1321440.1321544
DO - 10.1145/1321440.1321544
M3 - Conference contribution
AN - SCOPUS:63449101804
SN - 9781595938039
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 741
EP - 750
BT - CIKM 2007 - Proceedings of the 16th ACM Conference on Information and Knowledge Management
T2 - 16th ACM Conference on Information and Knowledge Management, CIKM 2007
Y2 - 6 November 2007 through 9 November 2007
ER -