A data-driven approach to develop physically sound predictors: Application to depth-averaged velocities on flows through submerged arrays of rigid cylinders

R. O. Tinoco, E. B. Goldstein, G. Coco

Research output: Contribution to journal › Article


We use a machine learning approach to seek an accurate, physically sound predictor of the mean velocity for open-channel flow when submerged arrays of rigid cylinders (model vegetation) are present. A genetic programming (GP) routine is used to find a robust relationship between relevant properties of the model vegetation and flow parameters. We use published data from laboratory experiments covering a broad range of conditions to obtain an equation that matches the accuracy of other predictors from recent literature while having a less complex structure. We also investigate how different criteria for data selection, as well as the size of the data set used to train the algorithm, influence the accuracy of the resulting predictors. Our results show that proper use of machine learning techniques not only provides empirical correlations but can also yield physically sound models that represent the physical processes involved. We provide a clear, thorough example of the application of GP, with its advantages and shortcomings, to encourage the use of data-driven techniques as part of the data analysis process and to address common misconceptions of machine learning as a simple correlation technique or physically senseless statistical analysis.

Original language: English (US)
Pages (from-to): 1247-1263
Number of pages: 17
Journal: Water Resources Research
Issue number: 2
State: Published - Feb 2015

Keywords

  • genetic programming
  • machine learning
  • open-channel flow
  • vegetation resistance

ASJC Scopus subject areas

  • Water Science and Technology
