TY - GEN
T1 - Random features for Kernel Deep Convex Network
AU - Huang, Po-Sen
AU - Deng, Li
AU - Hasegawa-Johnson, Mark
AU - He, Xiaodong
PY - 2013/10/18
Y1 - 2013/10/18
AB - The recently developed deep learning architecture, a kernel version of the deep convex network (K-DCN), is improved to address the scalability problem that arises when the training and testing samples become very large. We develop a solution based on random Fourier features, which have the strong theoretical property of approximating the Gaussian kernel while enabling efficient computation in both training and evaluation of the K-DCN with large training sets. We empirically demonstrate that, just like the conventional K-DCN with exact Gaussian kernels, random Fourier features enable successful stacking of kernel modules to form a deep architecture. Our evaluation experiments on phone recognition and speech understanding tasks both show the computational efficiency of the K-DCN with random features. With sufficient depth in the K-DCN, the phone recognition accuracy and slot-filling accuracy are comparable to or slightly higher than those of the K-DCN with Gaussian kernels, while achieving significant computational savings.
KW - deep learning
KW - kernel regression
KW - random features
KW - spoken language understanding
UR - http://www.scopus.com/inward/record.url?scp=84890480288&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84890480288&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2013.6638237
DO - 10.1109/ICASSP.2013.6638237
M3 - Conference contribution
AN - SCOPUS:84890480288
SN - 9781479903566
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 3143
EP - 3147
BT - 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
T2 - 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Y2 - 26 May 2013 through 31 May 2013
ER -