TY - CONF
T1 - The Case for Micro Foundation Models to Support Robust Edge Intelligence
AU - Kimura, Tomoyoshi
AU - Misra, Ashitabh
AU - Chen, Yizhuo
AU - Kara, Denizhan
AU - Li, Jinyang
AU - Wang, Tianshi
AU - Wang, Ruijie
AU - Bhattacharyya, Joydeep
AU - Kim, Jae
AU - Shenoy, Prashant
AU - Srivastava, Mani
AU - Wigness, Maggie
AU - Abdelzaher, Tarek
N1 - Research reported in this paper was sponsored in part by DEVCOM ARL under Cooperative Agreement W911NF-17-2-0196 (ARL IoBT CRA), in part by NSF CNS 20-38817, and in part by the Boeing Company. It was also supported in part by ACE, one of the seven centers in JUMP 2.0, a Semiconductor Research Corporation (SRC) program sponsored by DARPA. The views and conclusions contained in this document are those of the authors, not the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.
PY - 2024
AB - This paper advocates the concept of micro foundation models (µFMs), recently introduced by the authors to describe a category of self-supervised pre-training solutions that we argue are necessary to support robust intelligent inference tasks in Internet of Things (IoT) applications. The work is motivated by the fact that collecting sufficient amounts of labeled data to train AI/ML tasks in IoT applications is challenging, owing to the difficulty of labeling such data after the fact. In the absence of sufficient labeled data, supervised training solutions become brittle and prone to overfitting. Self-supervised training obviates the need for labeled data, allowing pre-training on the more readily available unlabeled data instead. Specifically, the µFMs discussed in this paper use self-supervised pre-training to develop an encoder that maps input data into a semantically organized latent representation in a manner agnostic to the downstream inference task. Our preliminary work shows that this (unsupervised) encoder can be moderately sized, yet produce a latent representation that simultaneously supports the fine-tuning of multiple downstream inference tasks, each at a minimal labeling cost. We demonstrate the efficacy of this pre-training/fine-tuning pipeline using a vibration-based µFM as a running case study. The study shows that fine-tuning inference tasks on top of the aforementioned encoder-produced latent representation needs orders of magnitude fewer labels than supervised training solutions, and that the resulting tasks are significantly more robust to environmental changes and easier to adapt to domain shifts than their supervised counterparts. Furthermore, we show that inference algorithms based on our example µFM can be executed in real time on a Raspberry Pi device, making the approach viable for the IoT space. We conclude that µFMs are a preferred (and likely necessary) route to robust intelligent sensing on IoT devices in areas where labeled data collection is challenging. The paper is a call for the research community to invest in µFM research for IoT applications.
KW - Foundation Models
KW - Internet of Things
KW - Self-Supervised Learning
UR - http://www.scopus.com/inward/record.url?scp=85217397136&partnerID=8YFLogxK
DO - 10.1109/CogMI62246.2024.00014
M3 - Conference contribution
AN - SCOPUS:85217397136
T3 - Proceedings - 2024 IEEE 6th International Conference on Cognitive Machine Intelligence, CogMI 2024
SP - 23
EP - 31
BT - Proceedings - 2024 IEEE 6th International Conference on Cognitive Machine Intelligence, CogMI 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th IEEE International Conference on Cognitive Machine Intelligence, CogMI 2024
Y2 - 28 October 2024 through 30 October 2024
ER -