This paper presents an overview of our work on image parsing, which we define as the problem of labeling each pixel in an image with its semantic category. Our aim is to achieve broad coverage across hundreds of object categories, many of them sparsely sampled. We first describe our baseline nonparametric region-based parsing system. This approach is based on lazy learning, and it can easily scale to datasets with tens of thousands of images and hundreds of labels. We then present three extensions to this baseline system. First, we simultaneously label each region as a semantic class (e.g., tree, building, car) and geometric class (sky, vertical, ground) while enforcing coherence between the two label types (roads can’t be labeled as vertical). Second, we extend this simultaneous labeling to an arbitrary number of label types. For example, we may want to simultaneously label every image region according to its basic-level object category (car, building, road, tree, etc.), superordinate category (animal, vehicle, manmade object, natural object, etc.), geometric orientation (horizontal, vertical, etc.), and material (metal, glass, wood, etc.). Finally, we present a hybrid parsing system that combines our region-based system with per-exemplar sliding window detectors to improve parsing performance on small object classes, giving broader coverage.