Extracting phenotypes from patient claim records using nonnegative tensor factorization

Joyce C. Ho, Joydeep Ghosh, Jimeng Sun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Electronic health records (EHRs) are becoming an increasingly important source of patient information. Unfortunately, EHR data do not always map directly and reliably to the medical concepts that clinical researchers need or use. Some recent studies have focused on EHR-derived phenotyping, which aims to map EHR data to specific medical concepts; however, most of these approaches require labor-intensive supervision from experienced clinical professionals. In this paper, we use Limestone, a nonnegative tensor factorization method, to derive phenotype candidates from claims data with virtually no human supervision. Limestone naturally represents the interactions between diagnoses and procedures among patients using tensors (a generalization of matrices). The resulting tensor factors are reported as phenotype candidates that automatically reveal patient clusters on specific diagnoses and procedures. To the best of our knowledge, this is the first study that successfully extracts useful phenotypes by applying sparse nonnegative tensor factorization to a large, public-domain EHR dataset covering a broad range of diseases. Our experiments demonstrate the interpretability and the promise of high-throughput phenotypes generated from tensor factorization.

Original language: English (US)
Title of host publication: Brain Informatics and Health - International Conference, BIH 2014, Proceedings
Publisher: Springer
Pages: 142-151
Number of pages: 10
ISBN (Print): 9783319098906
State: Published - 2014
Externally published: Yes
Event: 2014 International Conference on Brain Informatics and Health, BIH 2014 - Warsaw, Poland
Duration: Aug 11 2014 to Aug 14 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8609 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2014 International Conference on Brain Informatics and Health, BIH 2014
Country/Territory: Poland
City: Warsaw
Period: 8/11/14 to 8/14/14

Keywords

  • dimensionality reduction
  • EHR phenotyping
  • tensor factorization

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
