TY - JOUR
T1 - Forget-free Continual Learning with Winning Subnetworks
AU - Kang, Haeyong
AU - Mina, Rusty John Lloyd
AU - Madjid, Sultan Rizky Hikmawan
AU - Yoon, Jaehong
AU - Hasegawa-Johnson, Mark
AU - Hwang, Sung Ju
AU - Yoo, Chang D.
N1 - Acknowledgements. This work was partly supported by Institute for Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (2021-0-01381) and partly supported by the IITP grant (2022-0-00184).
PY - 2022
Y1 - 2022
N2 - Inspired by the Lottery Ticket Hypothesis, which posits that competitive subnetworks exist within a dense network, we propose a continual learning method referred to as Winning SubNetworks (WSN), which sequentially learns and selects an optimal subnetwork for each task. Specifically, WSN jointly learns the model weights and task-adaptive binary masks pertaining to subnetworks associated with each task, whilst attempting to select a small set of weights to be activated (winning ticket) by reusing weights of the prior subnetworks. The proposed method is inherently immune to catastrophic forgetting, as each selected subnetwork does not infringe upon the weights of other subnetworks. The binary masks produced per winning ticket are encoded into a single N-bit mask and then compressed using Huffman coding, yielding a sub-linear increase in network capacity with respect to the number of tasks. Code is available at https://github.com/ihaeyong/WSN.
AB - Inspired by the Lottery Ticket Hypothesis, which posits that competitive subnetworks exist within a dense network, we propose a continual learning method referred to as Winning SubNetworks (WSN), which sequentially learns and selects an optimal subnetwork for each task. Specifically, WSN jointly learns the model weights and task-adaptive binary masks pertaining to subnetworks associated with each task, whilst attempting to select a small set of weights to be activated (winning ticket) by reusing weights of the prior subnetworks. The proposed method is inherently immune to catastrophic forgetting, as each selected subnetwork does not infringe upon the weights of other subnetworks. The binary masks produced per winning ticket are encoded into a single N-bit mask and then compressed using Huffman coding, yielding a sub-linear increase in network capacity with respect to the number of tasks. Code is available at https://github.com/ihaeyong/WSN.
UR - http://www.scopus.com/inward/record.url?scp=85144814508&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85144814508&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85144814508
SN - 2640-3498
VL - 162
SP - 10734
EP - 10750
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 39th International Conference on Machine Learning, ICML 2022
Y2 - 17 July 2022 through 23 July 2022
ER -