Landmark diffusion maps (L-dMaps): Accelerated manifold learning out-of-sample extension

Andrew W. Long, Andrew L. Ferguson

Research output: Contribution to journal › Article

Abstract

Diffusion maps are a nonlinear manifold learning technique based on harmonic analysis of a diffusion process over the data. Out-of-sample extensions with computational complexity O(N), where N is the number of points comprising the manifold, frustrate online learning applications that require rapid embedding of high-dimensional data streams. We propose landmark diffusion maps (L-dMaps) to reduce the complexity to O(M), where M≪N is the number of landmark points selected using pruned spanning trees or k-medoids. Offering speedups on the order of N/M in out-of-sample extension, L-dMaps enable the application of diffusion maps to high-volume and/or high-velocity streaming data. We illustrate our approach on three datasets: the Swiss roll, molecular simulations of a C24H50 polymer chain, and biomolecular simulations of alanine dipeptide. We demonstrate up to 50-fold speedups in out-of-sample extension for the molecular systems, with errors of less than 4% in manifold reconstruction fidelity relative to calculations over the full dataset.
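The idea of embedding a new point against M landmarks rather than all N points can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the landmarks are drawn by uniform random subsampling as a stand-in for the pruned spanning tree or k-medoids selection named in the abstract, the landmarks are left unweighted, the kernel is a plain Gaussian with a hand-picked bandwidth `eps`, and the function names (`gaussian_kernel`, `fit_diffusion_map`, `nystrom_extend`) are hypothetical. The out-of-sample step is the standard Nyström extension, whose cost per new point scales with the number of landmarks.

```python
import numpy as np

def gaussian_kernel(A, B, eps):
    """Gaussian kernel between rows of A (n x d) and rows of B (m x d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * eps))

def fit_diffusion_map(landmarks, eps, n_components=2):
    """Diffusion map over the landmark set only (M points)."""
    K = gaussian_kernel(landmarks, landmarks, eps)
    d = K.sum(axis=1)
    # Symmetric conjugate S = D^{-1/2} K D^{-1/2} of the Markov matrix P = D^{-1} K.
    S = K / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Right eigenvectors of P are psi = D^{-1/2} v.
    psi = evecs / np.sqrt(d)[:, None]
    # Drop the trivial pair (eigenvalue 1, constant eigenvector).
    return evals[1:n_components + 1], psi[:, 1:n_components + 1]

def nystrom_extend(x_new, landmarks, evals, psi, eps):
    """Nystrom out-of-sample extension: O(M) kernel evaluations per new point."""
    K_new = gaussian_kernel(np.atleast_2d(x_new), landmarks, eps)
    P_new = K_new / K_new.sum(axis=1, keepdims=True)  # row-normalise
    return (P_new @ psi) / evals

# Synthetic data: N = 200 points in 3D, M = 20 landmarks (assumed values).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
landmarks = X[rng.choice(len(X), size=20, replace=False)]  # stand-in selection
eps = 1.0  # assumed bandwidth

evals, psi = fit_diffusion_map(landmarks, eps)
embedded = nystrom_extend(rng.normal(size=3), landmarks, evals, psi, eps)
```

A useful sanity check on the sketch is that extending a landmark itself returns that landmark's own embedding coordinates, since the Nyström formula reduces to the eigenvector equation on the landmark set.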

Original language: English (US)
Pages (from-to): 190-211
Number of pages: 22
Journal: Applied and Computational Harmonic Analysis
Volume: 47
Issue number: 1
State: Published - Jul 2019

Keywords

  • Diffusion maps
  • Harmonic analysis
  • Molecular simulation
  • Nonlinear dimensionality reduction
  • Spectral graph theory

ASJC Scopus subject areas

  • Applied Mathematics
