TY - GEN
T1 - Document-topic hierarchies from document graphs
AU - Weninger, Tim
AU - Bisk, Yonatan
AU - Han, Jiawei
PY - 2012
Y1 - 2012
N2 - Topic taxonomies present a multi-level view of a document collection, where general topics live towards the top of the taxonomy and more specific topics live towards the bottom. Topic taxonomies allow users to quickly drill down into their topic of interest to find documents. We show that hierarchies of documents, where documents live at the inner nodes of the hierarchy tree, can also be inferred by combining document text with inter-document links. We present a Bayesian generative model by which an explicit hierarchy of documents is created. Experiments on three document-graph data sets show that the generated document hierarchies are able to fit the observed data, and that the levels in the constructed document hierarchy represent practical groupings.
KW - Bayesian generative models
KW - hierarchical clustering
KW - model evaluation
KW - topic models
UR - http://www.scopus.com/inward/record.url?scp=84871084846&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84871084846&partnerID=8YFLogxK
U2 - 10.1145/2396761.2396843
DO - 10.1145/2396761.2396843
M3 - Conference contribution
AN - SCOPUS:84871084846
SN - 9781450311564
T3 - ACM International Conference Proceeding Series
SP - 635
EP - 644
BT - CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 21st ACM International Conference on Information and Knowledge Management, CIKM 2012
Y2 - 29 October 2012 through 2 November 2012
ER -