Because it lacks explicit spatial modeling, the existing epitome model may fail in image recognition and target detection, which directly motivates the spatialized epitome proposed in this paper. Extended from the original graphical model of the epitome, the spatialized epitome provides a general framework that integrates both the appearance and the spatial arrangement of image patches, yielding a more precise likelihood representation for an image or a set of images and eliminating ambiguities in image reconstruction and recognition. From the extended graphical model, an EM learning procedure is derived under the framework of variational approximation. The learning procedure generates an optimized summary of the image appearance together with the spatial distribution of similar patches. Given the spatialized epitome, we present a principled way of inferring the probability of a new input image under the learnt model, thereby enabling image recognition and target detection. We show how the incorporation of spatial information enhances the epitome's discriminative ability on several vision tasks, e.g., face recognition under misalignment and across poses, and vehicle detection with only a few training samples.