TY - CONF
T1 - Revealing the Power of Masked Autoencoders in Traffic Forecasting
AU - Sun, Jiarui
AU - Fan, Yujie
AU - Yeh, Chin Chia Michael
AU - Zhang, Wei
AU - Chowdhary, Girish
N1 - Publisher Copyright: © 2024 Owner/Author.
PY - 2024/10/21
Y1 - 2024/10/21
AB - Traffic forecasting, crucial for urban planning, requires accurate predictions of spatial-temporal traffic patterns across urban areas. Existing research mainly focuses on designing complex spatial-temporal models to capture these dependencies. However, this field faces challenges related to data scarcity and model stability, which results in limited performance improvement. To address these issues, we propose Spatial-Temporal Masked AutoEncoders (STMAE), a plug-and-play framework designed to enhance existing spatial-temporal models on traffic prediction. STMAE operates in two stages. In the pretraining stage, an encoder processes partially visible traffic data produced by a dual-masking strategy, including biased random walk-based spatial masking and patch-based temporal masking. Subsequently, two decoders aim to reconstruct the masked counterparts from both spatial and temporal perspectives. The fine-tuning stage retains the pretrained encoder and integrates it with decoders from existing backbones to improve traffic forecasting accuracy. Our results on traffic benchmarks show that STMAE can largely enhance the forecasting capabilities of various spatial-temporal models.
KW - masked autoencoders
KW - spatial-temporal models
KW - traffic forecasting
UR - http://www.scopus.com/inward/record.url?scp=85210037392&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85210037392&partnerID=8YFLogxK
U2 - 10.1145/3627673.3679989
DO - 10.1145/3627673.3679989
M3 - Conference contribution
AN - SCOPUS:85210037392
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 4071
EP - 4075
BT - CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024
Y2 - 21 October 2024 through 25 October 2024
ER -