TY - GEN
T1 - MOOSS
T2 - 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025
AU - Sun, Jiarui
AU - Akcal, M. Ugur
AU - Chowdhary, Girish
AU - Zhang, Wei
N1 - This work is supported in part by Navy N00014-19-1-2373, the joint NSF-USDA CPS Frontier project CNS #1954556, USDA-NIFA #2021-67021-34418, and Agriculture and Food Research Initiative (AFRI) grant no. 2020-67021-32799/project accession no. 1024178 from the USDA National Institute of Food and Agriculture: NSF/USDA National AI Institute: AIFARMS. Work is supported in part by NSF MRI grant #1725729 [28]. Work also used Delta GPU at NCSA Delta through allocation CIS230331 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program [6], which is supported by NSF grants #2138259, #2138286, #2138307, #2137603, and #2138296.
PY - 2025
Y1 - 2025
N2 - In visual Reinforcement Learning (RL), learning from pixel-based observations poses significant challenges on sample efficiency, primarily due to the complexity of extracting informative state representations from high-dimensional data. Previous methods such as contrastive-based approaches have made strides in improving sample efficiency but fall short in modeling the nuanced evolution of states. To address this, we introduce MOOSS, a novel framework that leverages a temporal contrastive objective with the help of graph-based spatial-temporal masking to explicitly model state evolution in visual RL. Specifically, we propose a self-supervised dual-component strategy that integrates (1) a graph construction of pixel-based observations for spatial-temporal masking, coupled with (2) a multilevel contrastive learning mechanism that enriches state representations by emphasizing temporal continuity and change of states. MOOSS advances the understanding of state dynamics by disrupting and learning from spatial-temporal correlations, which facilitates policy learning. Our comprehensive evaluation on multiple continuous and discrete control benchmarks shows that MOOSS outperforms previous state-of-the-art visual RL methods in terms of sample efficiency, demonstrating the effectiveness of our method.
AB - In visual Reinforcement Learning (RL), learning from pixel-based observations poses significant challenges on sample efficiency, primarily due to the complexity of extracting informative state representations from high-dimensional data. Previous methods such as contrastive-based approaches have made strides in improving sample efficiency but fall short in modeling the nuanced evolution of states. To address this, we introduce MOOSS, a novel framework that leverages a temporal contrastive objective with the help of graph-based spatial-temporal masking to explicitly model state evolution in visual RL. Specifically, we propose a self-supervised dual-component strategy that integrates (1) a graph construction of pixel-based observations for spatial-temporal masking, coupled with (2) a multilevel contrastive learning mechanism that enriches state representations by emphasizing temporal continuity and change of states. MOOSS advances the understanding of state dynamics by disrupting and learning from spatial-temporal correlations, which facilitates policy learning. Our comprehensive evaluation on multiple continuous and discrete control benchmarks shows that MOOSS outperforms previous state-of-the-art visual RL methods in terms of sample efficiency, demonstrating the effectiveness of our method.
UR - http://www.scopus.com/inward/record.url?scp=105003633234&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105003633234&partnerID=8YFLogxK
U2 - 10.1109/WACV61041.2025.00654
DO - 10.1109/WACV61041.2025.00654
M3 - Conference contribution
AN - SCOPUS:105003633234
T3 - Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
SP - 6719
EP - 6729
BT - Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 28 February 2025 through 4 March 2025
ER -