TY - JOUR
T1 - Explainable machine learning for hydrogen diffusion in metals and random binary alloys
AU - Lu, Grace M.
AU - Witman, Matthew
AU - Agarwal, Sapan
AU - Stavila, Vitalie
AU - Trinkle, Dallas R.
N1 - This work was supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE–1746047 (G.M.L.). The authors gratefully thank Dr. Rick Karnesky, Josh Vita, and Luke Wirth for the helpful comments and discussions. This work was supported by the Laboratory Directed Research and Development (LDRD) program at Sandia National Laboratories. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy's National Nuclear Security Administration (DOE/NNSA) under Contract No. DE-NA0003525.
PY - 2023/10
Y1 - 2023/10
N2 - Hydrogen diffusion in metals and alloys plays an important role in the discovery of new materials for fuel cell and energy storage technology. While analytic models use hand-selected features that have clear physical ties to hydrogen diffusion, they often lack accuracy when making quantitative predictions. Machine learning models are capable of making accurate predictions, but their inner workings are obscured, rendering it unclear which physical features are truly important. To develop interpretable machine learning models to predict the activation energies of hydrogen diffusion in metals and random binary alloys, we create a database for physical and chemical properties of the species and use it to fit six machine learning models. Our models achieve root-mean-squared errors between 98 and 119 meV on the testing data and accurately predict that elemental Ru has a large activation energy, while elemental Cr and Fe have small activation energies. By analyzing the feature importances of these fitted models, we identify relevant physical properties for predicting hydrogen diffusivity. While metrics for measuring the individual feature importances for machine learning models exist, correlations between the features lead to disagreement between models and limit the conclusions that can be drawn. Instead, grouped feature importances, formed by combining the features via their correlations, agree across the six models and reveal that the two groups containing the packing factor and electronic specific heat are particularly significant for predicting hydrogen diffusion in metals and random binary alloys. This framework allows us to interpret machine learning models and enables rapid screening of new materials with the desired rates of hydrogen diffusion.
AB - Hydrogen diffusion in metals and alloys plays an important role in the discovery of new materials for fuel cell and energy storage technology. While analytic models use hand-selected features that have clear physical ties to hydrogen diffusion, they often lack accuracy when making quantitative predictions. Machine learning models are capable of making accurate predictions, but their inner workings are obscured, rendering it unclear which physical features are truly important. To develop interpretable machine learning models to predict the activation energies of hydrogen diffusion in metals and random binary alloys, we create a database for physical and chemical properties of the species and use it to fit six machine learning models. Our models achieve root-mean-squared errors between 98 and 119 meV on the testing data and accurately predict that elemental Ru has a large activation energy, while elemental Cr and Fe have small activation energies. By analyzing the feature importances of these fitted models, we identify relevant physical properties for predicting hydrogen diffusivity. While metrics for measuring the individual feature importances for machine learning models exist, correlations between the features lead to disagreement between models and limit the conclusions that can be drawn. Instead, grouped feature importances, formed by combining the features via their correlations, agree across the six models and reveal that the two groups containing the packing factor and electronic specific heat are particularly significant for predicting hydrogen diffusion in metals and random binary alloys. This framework allows us to interpret machine learning models and enables rapid screening of new materials with the desired rates of hydrogen diffusion.
UR - http://www.scopus.com/inward/record.url?scp=85175401237&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85175401237&partnerID=8YFLogxK
U2 - 10.1103/PhysRevMaterials.7.105402
DO - 10.1103/PhysRevMaterials.7.105402
M3 - Article
AN - SCOPUS:85175401237
SN - 2475-9953
VL - 7
JO - Physical Review Materials
JF - Physical Review Materials
IS - 10
M1 - 105402
ER -