TY - JOUR
T1 - Modeling and recognition of landmark image collections using iconic scene graphs
AU - Raguram, Rahul
AU - Wu, Changchang
AU - Frahm, Jan-Michael
AU - Lazebnik, Svetlana
N1 - Funding Information: This research was supported in part by the DARPA ASSIST program, NSF grants IIS-0916829, IIS-0845629, and CNS-0751187, and other funding from the U.S. government. Svetlana Lazebnik was supported by the Microsoft Research Faculty Fellowship. We would also like to thank our collaborators Marc Pollefeys, Xiaowei Li, Christopher Zach, and Tim Johnson.
PY - 2011/12
Y1 - 2011/12
N2 - This article presents an approach for modeling landmarks based on large-scale, heavily contaminated image collections gathered from the Internet. Our system efficiently combines 2D appearance and 3D geometric constraints to extract scene summaries and construct 3D models. In the first stage of processing, images are clustered based on low-dimensional global appearance descriptors, and the clusters are refined using 3D geometric constraints. Each valid cluster is represented by a single iconic view, and the geometric relationships between iconic views are captured by an iconic scene graph. Using structure from motion techniques, the system then registers the iconic images to efficiently produce 3D models of the different aspects of the landmark. To improve coverage of the scene, these 3D models are subsequently extended using additional, non-iconic views. We also demonstrate the use of iconic images for recognition and browsing. Our experimental results demonstrate the ability to process datasets containing up to 46,000 images in less than 20 hours, using a single commodity PC equipped with a graphics card. This is a significant advance towards Internet-scale operation.
KW - Image clustering
KW - Landmark recognition
KW - Landmark reconstruction
KW - Location recognition
KW - Photo collection reconstruction
KW - Structure from motion
UR - http://www.scopus.com/inward/record.url?scp=80052947789&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80052947789&partnerID=8YFLogxK
U2 - 10.1007/s11263-011-0445-z
DO - 10.1007/s11263-011-0445-z
M3 - Article
AN - SCOPUS:80052947789
SN - 0920-5691
VL - 95
SP - 213
EP - 239
JO - International Journal of Computer Vision
JF - International Journal of Computer Vision
IS - 3
ER -