TY - GEN
T1 - Contrastive Self-Supervised Representation Learning for Sensing Signals from the Time-Frequency Perspective
AU - Liu, Dongxin
AU - Wang, Tianshi
AU - Liu, Shengzhong
AU - Wang, Ruijie
AU - Yao, Shuochao
AU - Abdelzaher, Tarek
N1 - Funding Information:
Research reported in this paper was sponsored by ARL under Cooperative Agreement W911NF-17-20196.
Publisher Copyright:
© 2021 IEEE.
PY - 2021/7
Y1 - 2021/7
N2 - This paper presents a contrastive self-supervised representation learning framework that is novel in being designed specifically for deep learning from frequency-domain data. Contrastive self-supervised representation learning trains neural networks using mostly unlabeled data. It is motivated by the need to reduce the labeling burden of deep learning. In this paper, we are specifically interested in applying this approach to physical sensing scenarios, such as those arising in Internet-of-Things (IoT) applications. Deep neural networks have been widely utilized in IoT applications, but the performance of such models largely depends on the availability of large labeled datasets, which in turn entails significant training costs. Motivated by the success of contrastive self-supervised representation learning at substantially reducing the need for labeled data (mostly in computer vision and natural language processing), there is growing interest in customizing the contrastive learning framework for IoT applications. Most existing work in that space approaches the problem from a time-domain perspective. However, IoT applications often measure physical phenomena whose underlying processes (such as acceleration, vibration, or wireless signal propagation) are fundamentally functions of signal frequencies and thus have sparser, more compact representations in the frequency domain. Recently, this observation motivated the development of Short-Time Fourier Neural Networks (STFNets), which learn directly in the frequency domain and were shown to offer large performance gains over Convolutional Neural Networks (CNNs) when designing supervised learning models for IoT tasks. Hence, in this paper, we introduce an STFNet-based Contrastive Self-supervised representation Learning framework (STF-CSL). STF-CSL takes both time-domain and frequency-domain features into consideration. We build the encoder using STFNet as the fundamental building block. We also apply both time-domain and frequency-domain data augmentation during the self-supervised training process. We evaluate the resulting performance of STF-CSL on various human activity recognition tasks. The evaluation results demonstrate that STF-CSL significantly outperforms time-domain-based self-supervised approaches, thereby substantially enhancing our ability to train deep neural networks from unlabeled data in IoT contexts.
AB - This paper presents a contrastive self-supervised representation learning framework that is novel in being designed specifically for deep learning from frequency-domain data. Contrastive self-supervised representation learning trains neural networks using mostly unlabeled data. It is motivated by the need to reduce the labeling burden of deep learning. In this paper, we are specifically interested in applying this approach to physical sensing scenarios, such as those arising in Internet-of-Things (IoT) applications. Deep neural networks have been widely utilized in IoT applications, but the performance of such models largely depends on the availability of large labeled datasets, which in turn entails significant training costs. Motivated by the success of contrastive self-supervised representation learning at substantially reducing the need for labeled data (mostly in computer vision and natural language processing), there is growing interest in customizing the contrastive learning framework for IoT applications. Most existing work in that space approaches the problem from a time-domain perspective. However, IoT applications often measure physical phenomena whose underlying processes (such as acceleration, vibration, or wireless signal propagation) are fundamentally functions of signal frequencies and thus have sparser, more compact representations in the frequency domain. Recently, this observation motivated the development of Short-Time Fourier Neural Networks (STFNets), which learn directly in the frequency domain and were shown to offer large performance gains over Convolutional Neural Networks (CNNs) when designing supervised learning models for IoT tasks. Hence, in this paper, we introduce an STFNet-based Contrastive Self-supervised representation Learning framework (STF-CSL). STF-CSL takes both time-domain and frequency-domain features into consideration. We build the encoder using STFNet as the fundamental building block. We also apply both time-domain and frequency-domain data augmentation during the self-supervised training process. We evaluate the resulting performance of STF-CSL on various human activity recognition tasks. The evaluation results demonstrate that STF-CSL significantly outperforms time-domain-based self-supervised approaches, thereby substantially enhancing our ability to train deep neural networks from unlabeled data in IoT contexts.
KW - Frequency Domain
KW - IoT
KW - Representation Learning
KW - Self-supervision
UR - http://www.scopus.com/inward/record.url?scp=85114965657&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85114965657&partnerID=8YFLogxK
U2 - 10.1109/ICCCN52240.2021.9522151
DO - 10.1109/ICCCN52240.2021.9522151
M3 - Conference contribution
AN - SCOPUS:85114965657
T3 - Proceedings - International Conference on Computer Communications and Networks, ICCCN
BT - 30th International Conference on Computer Communications and Networks, ICCCN 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 30th International Conference on Computer Communications and Networks, ICCCN 2021
Y2 - 19 July 2021 through 22 July 2021
ER -