This article explores data-driven Markov linear models of nonlinear fluid flows built on maps of the state into a sparse feature space. The underlying principle of low-order models for fluid systems is to identify maps to a feature space where (a) the system evolution is simpler and more efficient to model accurately and (b) the state can be recovered accurately from the features through an inverse mapping. Such methods are useful when real-time models are needed for online decision making from sensor data. The Markov linear approximation is popular because it allows us to leverage well-established linear systems machinery. Examples include Koopman operator approximation techniques and evolutionary kernel methods in machine learning. The success of these models in approximating nonlinear dynamical systems is tied to the effectiveness of the feature map in accomplishing both (a) and (b), provided the system admits a feasible prediction horizon from data. We assess this through an in-depth study of two classes of sparse linear feature transformations of the state: (i) a purely data-driven projection based on proper orthogonal decomposition (POD) that uses left singular vectors of the data snapshots, a staple of common Koopman approximation methods such as Dynamic Mode Decomposition (DMD) and variants such as extended DMD; and (ii) a partially data-driven sparse Gaussian kernel (sGK) regression (a mean sparse Gaussian Process (sGP) predictor). The sGK/sGP regression equivalently represents a projection onto an infinite-dimensional basis characterized by a kernel in the inner-product reproducing kernel Hilbert space (RKHS). 
We are particularly interested in the effectiveness of these linear feature maps for long-term prediction using limited data for three classes of fluid flows of escalating complexity (and decreasing prediction horizons): a limit-cycle attractor in a cylinder wake, a transient wake evolution with a shift in the base flow, and finally a continuously evolving buoyant Boussinesq mixing flow with no well-defined base state. The results indicate that a purely data-driven POD map works well for full state reconstruction as long as the basis remains relevant to the predictions, whereas the more generic sGK basis is less sensitive to the evolution of the dynamics but prone to reconstruction errors from lack of parsimony. In contrast, the sGK-maps outperform POD-based maps in learning the transient nonlinear evolution of the state for the same feature dimension in systems that contain one or more well-defined attractors. Consequently, both POD and sGK-maps require additional layer(s) to help mitigate these shortcomings. For example, POD-maps require nonlinear functional extensions for improved feature space predictions, whereas sGK-maps require dimensionality reduction to balance the large feature dimension needed for accurate full state reconstruction. However, both classes of multilayer feature maps fail to predict the highly evolving buoyant mixing flow, for very different reasons.
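As a rough illustration of feature map (i) only, the sketch below projects snapshots onto a POD basis, fits a one-step linear (Markov) operator in that feature space by least squares, and reconstructs the state by the inverse map. It is not the article's implementation: the function names, the synthetic travelling-wave data, and the choice of four modes are all assumptions made for the example.

```python
import numpy as np

def pod_basis(X, r):
    """Leading r left singular vectors of the snapshot matrix X (n_state x n_snapshots)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :r]

def fit_markov_linear(A):
    """Least-squares operator K such that A[:, 1:] ~= K @ A[:, :-1]."""
    return A[:, 1:] @ np.linalg.pinv(A[:, :-1])

def predict(K, a0, n_steps):
    """Advance the features with a_{k+1} = K a_k, returning n_steps columns."""
    feats = [a0]
    for _ in range(n_steps - 1):
        feats.append(K @ feats[-1])
    return np.stack(feats, axis=1)

# Hypothetical data: two travelling waves, a crude stand-in for a periodic
# (limit-cycle-like) flow. The snapshot matrix has rank 4 by construction.
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
t = 0.05 * np.arange(200)
X = (np.outer(np.sin(x), np.cos(2.0 * t)) + np.outer(np.cos(x), np.sin(2.0 * t))
     + 0.5 * np.outer(np.sin(2 * x), np.cos(4.0 * t))
     + 0.5 * np.outer(np.cos(2 * x), np.sin(4.0 * t)))

n_train, r = 100, 4                      # assumed training window and feature dimension
U = pod_basis(X[:, :n_train], r)         # feature map: a = U^T x
A_train = U.T @ X[:, :n_train]           # features of the training snapshots
K = fit_markov_linear(A_train)           # linear Markov model in feature space

A_pred = predict(K, A_train[:, 0], X.shape[1])
X_pred = U @ A_pred                      # inverse map: x ~= U a
rel_err = np.linalg.norm(X_pred - X) / np.linalg.norm(X)
print(f"relative error over the full horizon: {rel_err:.2e}")
```

The toy data are exactly linear in the four POD features, so the model extrapolates well beyond the training window. That is the favourable case for a POD map; a flow whose dominant structures drift away from the training basis would not behave this way.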

Original language: English (US)
Article number: 104252
Journal: Computers and Fluids
State: Published - Sep 15 2019


  • Data-driven
  • Feature map
  • Gaussian processes
  • Markovian
  • POD
  • Radial basis function kernels
  • Sparse

ASJC Scopus subject areas

  • General Computer Science
  • General Engineering


Dive into the research topics of 'Sparse feature map-based Markov models for nonlinear fluid flows'. Together they form a unique fingerprint.
