TY - JOUR
T1 - SRDA: An efficient algorithm for large scale discriminant analysis
AU - Cai, Deng
AU - He, Xiaofei
AU - Han, Jiawei
N1 - Funding Information:
The work was supported in part by the US National Science Foundation Grants IIS-05-13678 and BDI-05-15813 and MIAS (a DHS Institute of Discrete Science Center for Multimodal Information Access and Synthesis).
PY - 2008/1
Y1 - 2008/1
N2 - Linear Discriminant Analysis (LDA) has been a popular method for extracting features that preserve class separability. The projection functions of LDA are commonly obtained by maximizing the between-class covariance while simultaneously minimizing the within-class covariance. It has been widely used in many fields of information processing, such as machine learning, data mining, information retrieval, and pattern recognition. However, the computation of LDA involves the eigen-decomposition of dense matrices, which can be expensive in both time and memory. Specifically, LDA has O(mnt + t³) time complexity and requires O(mn + mt + nt) memory, where m is the number of samples, n is the number of features, and t = min(m, n). When both m and n are large, it is infeasible to apply LDA. In this paper, we propose a novel algorithm for discriminant analysis, called Spectral Regression Discriminant Analysis (SRDA). By using spectral graph analysis, SRDA casts discriminant analysis into a regression framework that facilitates both efficient computation and the use of regularization techniques. Specifically, SRDA only needs to solve a set of regularized least squares problems, and no eigenvector computation is involved, which is a huge saving of both time and memory. Our theoretical analysis shows that SRDA can be computed with O(ms) time and O(ms) memory, where s (≤ n) is the average number of non-zero features in each sample. Extensive experimental results on four real-world data sets demonstrate the effectiveness and efficiency of our algorithm.
AB - Linear Discriminant Analysis (LDA) has been a popular method for extracting features that preserve class separability. The projection functions of LDA are commonly obtained by maximizing the between-class covariance while simultaneously minimizing the within-class covariance. It has been widely used in many fields of information processing, such as machine learning, data mining, information retrieval, and pattern recognition. However, the computation of LDA involves the eigen-decomposition of dense matrices, which can be expensive in both time and memory. Specifically, LDA has O(mnt + t³) time complexity and requires O(mn + mt + nt) memory, where m is the number of samples, n is the number of features, and t = min(m, n). When both m and n are large, it is infeasible to apply LDA. In this paper, we propose a novel algorithm for discriminant analysis, called Spectral Regression Discriminant Analysis (SRDA). By using spectral graph analysis, SRDA casts discriminant analysis into a regression framework that facilitates both efficient computation and the use of regularization techniques. Specifically, SRDA only needs to solve a set of regularized least squares problems, and no eigenvector computation is involved, which is a huge saving of both time and memory. Our theoretical analysis shows that SRDA can be computed with O(ms) time and O(ms) memory, where s (≤ n) is the average number of non-zero features in each sample. Extensive experimental results on four real-world data sets demonstrate the effectiveness and efficiency of our algorithm.
KW - Dimensionality reduction
KW - Linear discriminant analysis
KW - Spectral regression
UR - http://www.scopus.com/inward/record.url?scp=36649009540&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=36649009540&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2007.190669
DO - 10.1109/TKDE.2007.190669
M3 - Article
AN - SCOPUS:36649009540
SN - 1041-4347
VL - 20
SP - 1
EP - 12
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 1
ER -