TY - GEN
T1 - Scalable moment-based inference for latent Dirichlet allocation
AU - Wang, Chi
AU - Liu, Xueqing
AU - Song, Yanglei
AU - Han, Jiawei
PY - 2014
Y1 - 2014
N2 - Topic models such as Latent Dirichlet Allocation have been useful text analysis methods of wide interest. Recently, moment-based inference with provable performance has been proposed for topic models. Compared with inference algorithms that approximate the maximum likelihood objective, moment-based inference has theoretical guarantee in recovering model parameters. One such inference method is tensor orthogonal decomposition, which requires only mild assumptions for exact recovery of topics. However, it suffers from scalability issue due to creation of dense, high-dimensional tensors. In this work, we propose a speedup technique by leveraging the special structure of the tensors. It is efficient in both time and space, and only requires scanning the corpus twice. It improves over the state-of-the-art inference algorithm by one to three orders of magnitude, while preserving equal inference ability.
AB - Topic models such as Latent Dirichlet Allocation have been useful text analysis methods of wide interest. Recently, moment-based inference with provable performance has been proposed for topic models. Compared with inference algorithms that approximate the maximum likelihood objective, moment-based inference has theoretical guarantee in recovering model parameters. One such inference method is tensor orthogonal decomposition, which requires only mild assumptions for exact recovery of topics. However, it suffers from scalability issue due to creation of dense, high-dimensional tensors. In this work, we propose a speedup technique by leveraging the special structure of the tensors. It is efficient in both time and space, and only requires scanning the corpus twice. It improves over the state-of-the-art inference algorithm by one to three orders of magnitude, while preserving equal inference ability.
UR - http://www.scopus.com/inward/record.url?scp=84907008538&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84907008538&partnerID=8YFLogxK
U2 - 10.1007/978-3-662-44845-8_19
DO - 10.1007/978-3-662-44845-8_19
M3 - Conference contribution
AN - SCOPUS:84907008538
SN - 9783662448441
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 290
EP - 305
BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2014, Proceedings
PB - Springer
T2 - European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2014
Y2 - 15 September 2014 through 19 September 2014
ER -