TY - GEN
T1 - Efficient manifold preserving audio source separation using locality sensitive hashing
AU - Kim, Minje
AU - Smaragdis, Paris
AU - Mysore, Gautham J.
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/8/4
Y1 - 2015/8/4
N2 - We propose an efficient technique to learn probabilistic hierarchical topic models that are designed to preserve the manifold structure of audio data. The consideration of the data manifold is important, as it has been shown to provide superior performance in certain audio applications such as source separation. However, the high computational cost of a sparse encoding step due to the requirement of a large dictionary prevents it from being used in real-world applications such as real-time speech enhancement and the analysis of big audio data. In order to achieve a substantial speed-up of this step, while still respecting the data manifold, we propose to harmonize a particular type of locality sensitive hashing with the hierarchical topic model. The proposed use of hashing can reduce the computational complexity of the sparse encoding by providing candidates of non-zero activations, where the candidate set is built based on Hamming distance. The hashing step is followed by comprehensive sparse coding that considers those candidates only, rather than the entire dictionary. Experimental results show that the proposed hashing technique can provide audio source separation results comparable to the similar system without hashing, but with significantly less and cheaper computation.
KW - Locality Sensitive Hashing
KW - Source Separation
KW - Topic Modeling
KW - Winner Take All Hashing
UR - http://www.scopus.com/inward/record.url?scp=84946046151&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84946046151&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2015.7178015
DO - 10.1109/ICASSP.2015.7178015
M3 - Conference contribution
AN - SCOPUS:84946046151
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 479
EP - 483
BT - 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Y2 - 19 April 2015 through 24 April 2015
ER -