TY - GEN
T1 - Online Learning and Planning in Time-Varying Environments
T2 - AIAA SciTech Forum and Exposition, 2024
AU - Puthumanaillam, Gokul
AU - Mamik, Yuvraj
AU - Ornik, Melkior
N1 - Publisher Copyright: © 2024 by The Board of Trustees of the University of Illinois.
PY - 2024
Y1 - 2024
AB - Aerospace vehicles routinely encounter uncertain, time-varying, and partially observable environments, presenting considerable challenges for autonomous operation and planning. Traditional learning methods, which excel in static contexts, often falter in such highly dynamic settings. Building on recently established Time-Varying Partially Observable Markov Decision Processes (TV-POMDP) and Memory Prioritized State Estimation (MPSE) methodologies, this work demonstrates their application in the advanced GUAM simulation environment, which models NASA’s Generic UAM concept. The contribution of this paper lies in refining these approaches to suit the complexity and unpredictability of aerospace contexts, where conventional learning strategies are insufficient. By applying MPSE, we enhance the estimation of environmental states with a weighted approach that respects the temporality and informational value of observations. The subsequent policy optimization process is informed by the estimations of these time-varying transition functions, leading to better long-term strategies that are aware of the rapid environmental shifts characteristic of aerospace scenarios. The validation of these methods through the GUAM simulator confirms their effectiveness, marking a positive step towards their practical implementation in autonomous aerospace vehicles that encounter continual, stochastic changes.
UR - http://www.scopus.com/inward/record.url?scp=85191327317&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85191327317&partnerID=8YFLogxK
U2 - 10.2514/6.2024-0109
DO - 10.2514/6.2024-0109
M3 - Conference contribution
AN - SCOPUS:85191327317
SN - 9781624107115
T3 - AIAA SciTech Forum and Exposition, 2024
BT - AIAA SciTech Forum and Exposition, 2024
PB - American Institute of Aeronautics and Astronautics Inc, AIAA
Y2 - 8 January 2024 through 12 January 2024
ER -