Sparse overcomplete latent variable decomposition of counts data

Madhusudana Shashanka, Bhiksha Raj, Paris Smaragdis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

An important problem in many fields is the analysis of counts data to extract meaningful latent components. Methods like Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) have been proposed for this purpose. However, they are limited in the number of components they can extract and lack an explicit provision to control the "expressiveness" of the extracted components. In this paper, we present a learning formulation to address these limitations by employing the notion of sparsity. We start with the PLSA framework and use an entropic prior in a maximum a posteriori formulation to enforce sparsity. We show that this allows the extraction of overcomplete sets of latent components which better characterize the data. We present experimental evidence of the utility of such representations.
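To make the setting concrete, the sketch below implements a plain PLSA EM decomposition of a count matrix, which is the starting point the abstract describes. The `sparsity` knob is only a crude exponentiate-and-renormalize stand-in for the paper's entropic-prior MAP update (whose exact solution involves Lambert's W function); the function name and all parameters are illustrative, not from the paper.

```python
import numpy as np

def plsa_em(V, K, n_iter=200, sparsity=0.0, seed=0):
    """Decompose a nonnegative count matrix V (features x documents) via EM
    into W (columns = P(f|z)) and H (columns = P(z|t)), as in basic PLSA.

    sparsity > 0 applies a rough low-entropy bias to H after each M-step;
    this is an illustrative approximation, not the paper's exact update.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, K)); W /= W.sum(axis=0, keepdims=True)
    H = rng.random((K, T)); H /= H.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        R = W @ H                            # model reconstruction P(f|t)
        Q = V / np.maximum(R, 1e-12)         # E-step ratio V[f,t] / R[f,t]
        W_new = W * (Q @ H.T)                # M-step: reweight by posteriors
        H_new = H * (W.T @ Q)
        W = W_new / np.maximum(W_new.sum(axis=0, keepdims=True), 1e-12)
        if sparsity > 0:                     # crude stand-in for entropic prior
            H_new = H_new ** (1.0 + sparsity)
        H = H_new / np.maximum(H_new.sum(axis=0, keepdims=True), 1e-12)
    return W, H

# Usage: decompose a tiny count matrix whose third column mixes the first two.
V = np.array([[10., 0., 5.],
              [0., 10., 5.],
              [5., 5., 5.]])
W, H = plsa_em(V, K=2)
```

The overcomplete regime the paper studies corresponds to choosing K larger than the feature dimension, where the sparsity term is what keeps the decomposition meaningful.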

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference
Publisher: Neural Information Processing Systems
ISBN (Print): 160560352X, 9781605603520
State: Published - 2008
Externally published: Yes
Event: 21st Annual Conference on Neural Information Processing Systems, NIPS 2007 - Vancouver, BC, Canada
Duration: Dec 3, 2007 - Dec 6, 2007

Publication series

Name: Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference

Conference

Conference: 21st Annual Conference on Neural Information Processing Systems, NIPS 2007
Country/Territory: Canada
City: Vancouver, BC
Period: 12/3/07 - 12/6/07

ASJC Scopus subject areas

  • Information Systems
