Landmark diffusion maps (L-dMaps): Accelerated manifold learning out-of-sample extension

Andrew W. Long, Andrew L. Ferguson

Research output: Contribution to journalArticle

Abstract

Diffusion maps are a nonlinear manifold learning technique based on harmonic analysis of a diffusion process over the data. Out-of-sample extensions with computational complexity O(N), where N is the number of points comprising the manifold, frustrate applications to online learning applications requiring rapid embedding of high-dimensional data streams. We propose landmark diffusion maps (L-dMaps)to reduce the complexity to O(M), where M≪N is the number of landmark points selected using pruned spanning trees or k-medoids. Offering (N/M)speedups in out-of-sample extension, L-dMaps enable the application of diffusion maps to high-volume and/or high-velocity streaming data. We illustrate our approach on three datasets: the Swiss roll, molecular simulations of a C 24 H 50 polymer chain, and biomolecular simulations of alanine dipeptide. We demonstrate up to 50-fold speedups in out-of-sample extension for the molecular systems with less than 4% errors in manifold reconstruction fidelity relative to calculations over the full dataset.

Original languageEnglish (US)
Pages (from-to)190-211
Number of pages22
JournalApplied and Computational Harmonic Analysis
Volume47
Issue number1
DOIs
StatePublished - Jul 2019

Fingerprint

Manifold Learning
Landmarks
Streaming Data
Molecular Simulation
Online Learning
Harmonic Analysis
High-dimensional Data
Data Streams
Spanning tree
Harmonic analysis
Diffusion Process
Fidelity
Computational Complexity
Fold
Polymers
Computational complexity
Demonstrate
Simulation

Keywords

  • Diffusion maps
  • Harmonic analysis
  • Molecular simulation
  • Nonlinear dimensionality reduction
  • Spectral graph theory

ASJC Scopus subject areas

  • Applied Mathematics

Cite this

Landmark diffusion maps (L-dMaps) : Accelerated manifold learning out-of-sample extension. / Long, Andrew W.; Ferguson, Andrew L.

In: Applied and Computational Harmonic Analysis, Vol. 47, No. 1, 07.2019, p. 190-211.

Research output: Contribution to journalArticle

@article{3c3874e987194bf2b18089c7aca83fc5,
title = "Landmark diffusion maps (L-dMaps): Accelerated manifold learning out-of-sample extension",
abstract = "Diffusion maps are a nonlinear manifold learning technique based on harmonic analysis of a diffusion process over the data. Out-of-sample extensions with computational complexity O(N), where N is the number of points comprising the manifold, frustrate applications to online learning applications requiring rapid embedding of high-dimensional data streams. We propose landmark diffusion maps (L-dMaps)to reduce the complexity to O(M), where M≪N is the number of landmark points selected using pruned spanning trees or k-medoids. Offering (N/M)speedups in out-of-sample extension, L-dMaps enable the application of diffusion maps to high-volume and/or high-velocity streaming data. We illustrate our approach on three datasets: the Swiss roll, molecular simulations of a C 24 H 50 polymer chain, and biomolecular simulations of alanine dipeptide. We demonstrate up to 50-fold speedups in out-of-sample extension for the molecular systems with less than 4{\%} errors in manifold reconstruction fidelity relative to calculations over the full dataset.",
keywords = "Diffusion maps, Harmonic analysis, Molecular simulation, Nonlinear dimensionality reduction, Spectral graph theory",
author = "Long, {Andrew W.} and Ferguson, {Andrew L.}",
year = "2019",
month = "7",
doi = "10.1016/j.acha.2017.08.004",
language = "English (US)",
volume = "47",
pages = "190--211",
journal = "Applied and Computational Harmonic Analysis",
issn = "1063-5203",
publisher = "Academic Press Inc.",
number = "1",

}

TY - JOUR

T1 - Landmark diffusion maps (L-dMaps)

T2 - Accelerated manifold learning out-of-sample extension

AU - Long, Andrew W.

AU - Ferguson, Andrew L.

PY - 2019/7

Y1 - 2019/7

N2 - Diffusion maps are a nonlinear manifold learning technique based on harmonic analysis of a diffusion process over the data. Out-of-sample extensions with computational complexity O(N), where N is the number of points comprising the manifold, frustrate applications to online learning applications requiring rapid embedding of high-dimensional data streams. We propose landmark diffusion maps (L-dMaps)to reduce the complexity to O(M), where M≪N is the number of landmark points selected using pruned spanning trees or k-medoids. Offering (N/M)speedups in out-of-sample extension, L-dMaps enable the application of diffusion maps to high-volume and/or high-velocity streaming data. We illustrate our approach on three datasets: the Swiss roll, molecular simulations of a C 24 H 50 polymer chain, and biomolecular simulations of alanine dipeptide. We demonstrate up to 50-fold speedups in out-of-sample extension for the molecular systems with less than 4% errors in manifold reconstruction fidelity relative to calculations over the full dataset.

AB - Diffusion maps are a nonlinear manifold learning technique based on harmonic analysis of a diffusion process over the data. Out-of-sample extensions with computational complexity O(N), where N is the number of points comprising the manifold, frustrate applications to online learning applications requiring rapid embedding of high-dimensional data streams. We propose landmark diffusion maps (L-dMaps)to reduce the complexity to O(M), where M≪N is the number of landmark points selected using pruned spanning trees or k-medoids. Offering (N/M)speedups in out-of-sample extension, L-dMaps enable the application of diffusion maps to high-volume and/or high-velocity streaming data. We illustrate our approach on three datasets: the Swiss roll, molecular simulations of a C 24 H 50 polymer chain, and biomolecular simulations of alanine dipeptide. We demonstrate up to 50-fold speedups in out-of-sample extension for the molecular systems with less than 4% errors in manifold reconstruction fidelity relative to calculations over the full dataset.

KW - Diffusion maps

KW - Harmonic analysis

KW - Molecular simulation

KW - Nonlinear dimensionality reduction

KW - Spectral graph theory

UR - http://www.scopus.com/inward/record.url?scp=85028622354&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85028622354&partnerID=8YFLogxK

U2 - 10.1016/j.acha.2017.08.004

DO - 10.1016/j.acha.2017.08.004

M3 - Article

AN - SCOPUS:85028622354

VL - 47

SP - 190

EP - 211

JO - Applied and Computational Harmonic Analysis

JF - Applied and Computational Harmonic Analysis

SN - 1063-5203

IS - 1

ER -