High frequency residual learning for multi-scale image classification

Bowen Cheng, Rong Xiao, Jianfeng Wang, Thomas Huang, Lei Zhang

Research output: Contribution to conference › Paper › peer-review


We present a novel high frequency residual learning framework, which leads to a highly efficient multi-scale network (MSNet) architecture for mobile and embedded vision problems. The architecture utilizes two networks: a low resolution network to efficiently approximate low frequency components and a high resolution network to learn high frequency residuals by reusing the upsampled low resolution features. With a classifier calibration module, MSNet can dynamically allocate computational resources during inference to achieve a better speed and accuracy trade-off. We evaluate our method on the challenging ImageNet-1k dataset and observe consistent improvements over different base networks. On ResNet-18 and MobileNet with α = 1.0, MSNet gains 1.5% accuracy over both architectures without increasing computation. On the more efficient MobileNet with α = 0.25, our method gains 3.8% accuracy with the same amount of computation.
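
The two ideas in the abstract, splitting an input into a low frequency approximation plus a high frequency residual, and spending extra computation only when needed, can be illustrated with a minimal sketch. This is not the authors' implementation: MSNet works on learned network features, whereas this toy version works on raw pixel grids. The function names (`split_frequencies`, `gated_predict`), the confidence threshold rule, and the stub networks are all assumptions made for illustration; the abstract does not specify how the classifier calibration module decides when to run the high resolution network.

```python
import math


def downsample2x(img):
    """Average-pool a 2D grid by a factor of 2 (even height and width assumed)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def upsample2x(img):
    """Nearest-neighbour upsample of a 2D grid by a factor of 2."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def split_frequencies(img):
    """Return (low, high) with low + high == img.

    low  : upsampled low resolution approximation of img
    high : the residual that the low resolution view cannot represent
    """
    low = upsample2x(downsample2x(img))
    high = [[a - b for a, b in zip(r_img, r_low)] for r_img, r_low in zip(img, low)]
    return low, high


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_predict(image, low_net, high_net, threshold=0.9):
    """Hypothetical confidence-gated inference (an assumed rule, not the paper's).

    Run the cheap low resolution network first; only if its top softmax
    probability is below `threshold`, run the high resolution network,
    which receives the low resolution logits so it can reuse them.
    """
    low_logits = low_net(downsample2x(image))
    probs = softmax(low_logits)
    if max(probs) >= threshold:
        return probs.index(max(probs)), "low"
    high_logits = high_net(image, low_logits)
    probs = softmax(high_logits)
    return probs.index(max(probs)), "high"
```

In this sketch the high resolution path only has to model what the low resolution path missed, which is the intuition behind learning residuals rather than a second full representation. The threshold trades speed for accuracy: a higher value sends more inputs through the expensive path.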

Original language: English (US)
State: Published - 2020
Event: 30th British Machine Vision Conference, BMVC 2019 - Cardiff, United Kingdom
Duration: Sep 9 2019 → Sep 12 2019


Conference: 30th British Machine Vision Conference, BMVC 2019
Country/Territory: United Kingdom

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

