AutoScaler: Scale-attention Networks for Visual Correspondence

Shenlong Wang, Linjie Luo, Ning Zhang, Jia Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Finding visual correspondence between local features is key to many computer vision problems. While defining features with larger contextual scales usually implies greater discriminativeness, it could also lead to less spatial accuracy of the features. We propose AutoScaler, a scale-attention network to explicitly optimize this trade-off in visual correspondence tasks. Our architecture consists of a weight-sharing feature network to compute multi-scale feature maps and an attention network to combine them optimally in the scale space. This allows our network to have adaptive sizes of equivalent receptive field over different scales of the input. The entire network can be trained end-to-end in a Siamese framework for visual correspondence tasks. Using the latest off-the-shelf architecture for the feature network, our method achieves competitive results compared to state-of-the-art methods on challenging optical flow and semantic matching benchmarks, including Sintel, KITTI and CUB-2011. We also show that our attention network alone can be applied to existing hand-crafted feature descriptors (e.g., Daisy) and improve their performance on visual correspondence tasks. Finally, we illustrate how the scale-attention maps generated from the attention network are visually interpretable.
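To make the idea of combining multi-scale feature maps in scale space concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the attention network emits one score per scale at every pixel and that the scores are normalized with a softmax across scales before a weighted sum; the abstract does not specify either choice. The names `fuse_scales`, `features` and `logits` are hypothetical, and random arrays stand in for the outputs of the feature and attention networks.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def fuse_scales(features, logits):
    """Combine per-scale feature maps with per-pixel scale attention.

    features: array of shape (S, H, W, C), one feature map per scale,
              all resampled to the same spatial resolution (assumption).
    logits:   array of shape (S, H, W), unnormalized attention scores
              (assumed to come from an attention network).
    Returns the fused map of shape (H, W, C) and the attention weights
    of shape (S, H, W), which sum to 1 over the scale axis.
    """
    weights = softmax(logits, axis=0)
    # Broadcast the weights over the channel axis and sum over scales.
    fused = (weights[..., None] * features).sum(axis=0)
    return fused, weights


# Stand-in data: 3 scales, a 4x5 image, 8 feature channels.
rng = np.random.default_rng(0)
features = rng.normal(size=(3, 4, 5, 8))
logits = rng.normal(size=(3, 4, 5))

fused, weights = fuse_scales(features, logits)
print(fused.shape, weights.shape)
```

Under these assumptions, the `weights` array plays the role of a scale-attention map: at each pixel it indicates how much each scale contributes to the fused descriptor, which is the kind of quantity one could visualize per scale.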

Original language: English (US)
Title of host publication: British Machine Vision Conference 2017, BMVC 2017
Publisher: BMVA Press
ISBN (Electronic): 190172560X, 9781901725605
State: Published - 2017
Externally published: Yes
Event: 28th British Machine Vision Conference, BMVC 2017 - London, United Kingdom
Duration: Sep 4 2017 → Sep 7 2017

Publication series

Name: British Machine Vision Conference 2017, BMVC 2017


Conference: 28th British Machine Vision Conference, BMVC 2017
Country/Territory: United Kingdom

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

