TY - GEN
T1 - Towards open-universe image parsing with broad coverage
AU - Tighe, Joseph
AU - Lazebnik, Svetlana
N1 - Publisher Copyright: © 2013 MVA Organization. All rights reserved.
PY - 2013
Y1 - 2013
AB - This paper presents an overview of our work on image parsing, which we define as the problem of labeling each pixel in an image with its semantic category. Our aim is to achieve broad coverage across hundreds of object categories, many of them sparsely sampled. We first describe our baseline nonparametric region-based parsing system. This approach is based on lazy learning, and it can easily scale to datasets with tens of thousands of images and hundreds of labels. We then present three extensions to this baseline system. First, we simultaneously label each region as a semantic class (e.g., tree, building, car) and geometric class (sky, vertical, ground) while enforcing coherence between the two label types (roads can’t be labeled as vertical). Second, we extend this simultaneous labeling to an arbitrary number of label types. For example, we may want to simultaneously label every image region according to its basic-level object category (car, building, road, tree, etc.), superordinate category (animal, vehicle, manmade object, natural object, etc.), geometric orientation (horizontal, vertical, etc.), and material (metal, glass, wood, etc.). Finally, we present a hybrid parsing system that combines our region-based system with per-exemplar sliding window detectors to improve parsing performance on small object classes, giving broader coverage.
UR - http://www.scopus.com/inward/record.url?scp=85083083961&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85083083961&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85083083961
SN - 9784901122139
T3 - Proceedings of the 13th IAPR International Conference on Machine Vision Applications, MVA 2013
SP - 13
EP - 20
BT - Proceedings of the 13th IAPR International Conference on Machine Vision Applications, MVA 2013
PB - MVA Organization
T2 - 13th IAPR International Conference on Machine Vision Applications, MVA 2013
Y2 - 20 May 2013 through 23 May 2013
ER -