Sparse hierarchical Tucker factorization and its application to healthcare

Ioakeim Perros, Robert Chen, Richard Vuduc, Jimeng Sun

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We propose a new tensor factorization method, called the Sparse Hierarchical-Tucker (Sparse H-Tucker), for sparse and high-order data tensors. Sparse H-Tucker is inspired by its namesake, the classical Hierarchical Tucker method, which aims to compute a tree-structured factorization of an input data set that may be readily interpreted by a domain expert. However, Sparse H-Tucker uses a nested sampling technique to overcome a key scalability problem in Hierarchical Tucker: the creation of an unwieldy intermediate dense core tensor. The result is a faster, more space-efficient, and more accurate method. We test our method on a real healthcare dataset, which is collected from 30K patients and results in an 18th-order sparse data tensor. Unlike competing methods, Sparse H-Tucker can analyze the full data set on a single multi-threaded machine. It can also do so more accurately and in less time than the state of the art: on a 12th-order subset of the input data, Sparse H-Tucker is 18x more accurate and 7.5x faster than a previous state-of-the-art method. Moreover, we observe that Sparse H-Tucker scales nearly linearly in the number of non-zero tensor elements. The resulting model also provides an interpretable disease hierarchy, which is confirmed by a clinical expert.
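
The scalability concern raised in the abstract can be illustrated with back-of-the-envelope arithmetic. The sketch below is not the authors' method and uses no figures from the paper: the mode size of 100 and the non-zero count are hypothetical, chosen only to show why a dense 18th-order tensor cannot be materialized while a coordinate-style sparse representation grows linearly with the number of non-zeros.

```python
from math import prod


def dense_entry_count(shape):
    """Number of cells a dense tensor of this shape would hold."""
    return prod(shape)


def coo_storage_count(shape, nnz):
    """Numbers stored in coordinate (COO) form:
    one index per mode plus one value, for each non-zero entry."""
    return nnz * (len(shape) + 1)


# Hypothetical sizes (assumptions, not taken from the paper):
# 18 modes with 100 categories each, and one million non-zeros.
shape = (100,) * 18
nnz = 1_000_000

dense = dense_entry_count(shape)        # grows exponentially in the order
sparse = coo_storage_count(shape, nnz)  # grows linearly in nnz

print(f"dense cells:   {dense:.3e}")
print(f"sparse values: {sparse:,}")
```

Under these assumed sizes, doubling the number of non-zeros doubles the sparse storage, whereas adding one more mode multiplies the dense cell count by the size of that mode.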

Original language: English (US)
Title of host publication: Proceedings - 15th IEEE International Conference on Data Mining, ICDM 2015
Editors: Charu Aggarwal, Zhi-Hua Zhou, Alexander Tuzhilin, Hui Xiong, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 943-948
Number of pages: 6
ISBN (Electronic): 9781467395038
State: Published - Jan 5 2016
Externally published: Yes
Event: 15th IEEE International Conference on Data Mining, ICDM 2015 - Atlantic City, United States
Duration: Nov 14 2015 to Nov 17 2015

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Volume: 2016-January
ISSN (Print): 1550-4786

Other

Other: 15th IEEE International Conference on Data Mining, ICDM 2015
Country/Territory: United States
City: Atlantic City
Period: 11/14/15 to 11/17/15

ASJC Scopus subject areas

  • General Engineering
