TY - JOUR
T1 - The three R's of computer vision
T2 - Recognition, reconstruction and reorganization
AU - Malik, Jitendra
AU - Arbeláez, Pablo
AU - Carreira, João
AU - Fragkiadaki, Katerina
AU - Girshick, Ross
AU - Gkioxari, Georgia
AU - Gupta, Saurabh
AU - Hariharan, Bharath
AU - Kar, Abhishek
AU - Tulsiani, Shubham
N1 - Publisher Copyright: © 2016 Published by Elsevier B.V.
PY - 2016/3/1
Y1 - 2016/3/1
N2 - We argue for the importance of the interaction between recognition, reconstruction and reorganization, and propose that as a unifying framework for computer vision. In this view, recognition of objects is reciprocally linked to reorganization, with bottom-up grouping processes generating candidates, which can be classified using top-down knowledge, following which the segmentations can be refined again. Recognition of 3D objects could benefit from a reconstruction of 3D structure, and 3D reconstruction can benefit from object category-specific priors. We also show that reconstruction of 3D structure from video data goes hand in hand with the reorganization of the scene. We demonstrate pipelined versions of two systems, one for RGB-D images, and another for RGB images, which produce rich 3D scene interpretations in this framework.
KW - 3D models
KW - Action recognition
KW - Grouping
KW - Object recognition
KW - Segmentation
KW - Shape reconstruction
UR - http://www.scopus.com/inward/record.url?scp=84961138317&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84961138317&partnerID=8YFLogxK
U2 - 10.1016/j.patrec.2016.01.019
DO - 10.1016/j.patrec.2016.01.019
M3 - Article
AN - SCOPUS:84961138317
SN - 0167-8655
VL - 72
SP - 4
EP - 14
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
ER -