Abstract

We propose the approximation-theoretic technique of optimal recovery for imputing missing values in clustered data, specifically for non-negative matrix factorization (NMF), and develop an algorithm to implement it. Under certain geometric conditions, we prove tight upper bounds on the NMF relative error; these are the first bounds of this type for missing values. Experiments on image data and biological data show that this technique performs as well as or better than other imputation techniques that account for local structure.
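
As a rough illustration of the pipeline the abstract describes (impute missing values within clusters, then factorize), the following is a minimal sketch and not the authors' algorithm. It assumes that cluster labels are already known, that rows are samples, and that the feasible set for a missing entry is simply the interval spanned by the observed values of that feature within the same cluster. Under that assumption, the interval midpoint minimizes the worst-case error, which is the optimal-recovery idea in its simplest one-dimensional form. The factorization step uses the standard Lee-Seung multiplicative updates as a stand-in, and the relative error is taken to be the Frobenius-norm ratio; both choices are assumptions. The names `impute_interval_midpoint`, `nmf_multiplicative`, and `relative_error` are hypothetical.

```python
import numpy as np


def impute_interval_midpoint(X, labels):
    """Fill NaNs with the midpoint of the observed range of the same
    feature within the same cluster (assumed feasible set: an interval).
    Assumes every feature has at least one observed value overall."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    filled = X.copy()
    for c in np.unique(labels):
        rows = np.where(labels == c)[0]
        for j in range(X.shape[1]):
            col = X[rows, j]
            missing = np.isnan(col)
            if not missing.any():
                continue
            observed = col[~missing]
            if observed.size == 0:
                # Fallback (assumption): use the whole column's observed values.
                observed = X[~np.isnan(X[:, j]), j]
            filled[rows[missing], j] = 0.5 * (observed.min() + observed.max())
    return filled


def nmf_multiplicative(X, rank, n_iter=200, seed=0, eps=1e-10):
    """Standard Lee-Seung multiplicative updates for X ~ W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def relative_error(X, W, H):
    """Frobenius-norm relative error ||X - WH||_F / ||X||_F (assumed definition)."""
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)


# Toy data: two clusters of three samples each, two features, two missing entries.
X_demo = np.array([
    [1.0, 2.0],
    [3.0, np.nan],
    [5.0, 6.0],
    [10.0, 20.0],
    [np.nan, 40.0],
    [30.0, 60.0],
])
labels_demo = np.array([0, 0, 0, 1, 1, 1])

filled = impute_interval_midpoint(X_demo, labels_demo)
W, H = nmf_multiplicative(filled, rank=2)
err = relative_error(filled, W, H)
```

In the toy data, the missing entry in row 1 is filled with 4.0 (midpoint of 2 and 6) and the one in row 4 with 20.0 (midpoint of 10 and 30). A more faithful treatment would replace the interval with whatever feasible set the geometric conditions of the paper define, which this sketch does not attempt.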

Original language: English (US)
Title of host publication: 2019 IEEE Data Science Workshop, DSW 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 180-184
Number of pages: 5
ISBN (Electronic): 9781728107080
State: Published - Jun 2019
Event: 2019 IEEE Data Science Workshop, DSW 2019 - Minneapolis, United States
Duration: Jun 2 2019 to Jun 5 2019

Publication series

Name: 2019 IEEE Data Science Workshop, DSW 2019 - Proceedings

Conference

Conference: 2019 IEEE Data Science Workshop, DSW 2019
Country/Territory: United States
City: Minneapolis
Period: 6/2/19 to 6/5/19

Keywords

  • imputation
  • missing values
  • non-negative matrix factorization
  • optimal recovery

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Artificial Intelligence
  • Computer Networks and Communications
  • Safety, Risk, Reliability and Quality

Title: Non-Negative Matrix Factorization of Clustered Data with Missing Values