TY - JOUR
T1 - Improving Stream Solute Predictions With a Modified LSTM Model Incorporating Solute Interdependences and Hysteresis Patterns
AU - Agrawal, Tarun
AU - Goodwell, Allison
AU - Kumar, Praveen
N1 - National Science Foundation. Grant Numbers: OSF 1835834, EAR 2012850
PY - 2025/3
Y1 - 2025/3
N2 - Surface runoff and infiltrated water en route to the stream interact with dynamic landscape properties, ranging from vegetation and microbial activities to soil and geological attributes. Stream solute concentrations are highly variable and interconnected due to these interactions, flow paths, and residence times, and often exhibit hysteresis with flow. Significant unknowns remain about how point measurements of stream solute chemistry reflect interdependent hydrobiogeochemical and physical processes, and how signatures are encapsulated as nonlinear dynamical relationships between variables. We take a Machine Learning (ML) approach to understand and capture these dynamical relationships and improve predictions of solutes at short and long time scales. We introduce a physical process-based “flow-gate” into a Long Short-Term Memory (LSTM) model, which enables the model to learn hysteresis behaviors if they exist. Further, we use information-theoretic metrics to detect how solutes are interdependent and iteratively select source solutes that best predict a given target solute concentration. The “flow-gate LSTM” model improves model predictions (1%–32% decreases in RMSE) relative to the standard LSTM model for all nine solutes included in the study. The predictive improvements from the flow-gate LSTM model highlight the importance of lagged concentration and discharge relationships for certain solutes. It also indicates a potential limitation in the traditional LSTM model approach since flow rates are always provided as input sources, but this information is not fully utilized. This work provides a starting point for a predictive understanding of geochemical interdependencies using machine-learning approaches and highlights potential improvements in model architecture.
AB - Surface runoff and infiltrated water en route to the stream interact with dynamic landscape properties, ranging from vegetation and microbial activities to soil and geological attributes. Stream solute concentrations are highly variable and interconnected due to these interactions, flow paths, and residence times, and often exhibit hysteresis with flow. Significant unknowns remain about how point measurements of stream solute chemistry reflect interdependent hydrobiogeochemical and physical processes, and how signatures are encapsulated as nonlinear dynamical relationships between variables. We take a Machine Learning (ML) approach to understand and capture these dynamical relationships and improve predictions of solutes at short and long time scales. We introduce a physical process-based “flow-gate” into a Long Short-Term Memory (LSTM) model, which enables the model to learn hysteresis behaviors if they exist. Further, we use information-theoretic metrics to detect how solutes are interdependent and iteratively select source solutes that best predict a given target solute concentration. The “flow-gate LSTM” model improves model predictions (1%–32% decreases in RMSE) relative to the standard LSTM model for all nine solutes included in the study. The predictive improvements from the flow-gate LSTM model highlight the importance of lagged concentration and discharge relationships for certain solutes. It also indicates a potential limitation in the traditional LSTM model approach since flow rates are always provided as input sources, but this information is not fully utilized. This work provides a starting point for a predictive understanding of geochemical interdependencies using machine-learning approaches and highlights potential improvements in model architecture.
U2 - 10.1029/2024JH000383
DO - 10.1029/2024JH000383
M3 - Article
SN - 2993-5210
VL - 2
JO - Journal of Geophysical Research: Machine Learning and Computation
JF - Journal of Geophysical Research: Machine Learning and Computation
IS - 1
M1 - e2024JH000383
ER -