Template matching is a fundamental operator in computer vision and is widely used in feature tracking, motion estimation, image alignment, and mosaicing. Under a given parameterized warping model, the traditional template matching algorithm estimates the geometric warp parameters that minimize the sum of squared differences (SSD) between the target and a warped template. The performance of template matching can be characterized by deriving the distribution of the warp parameter estimate as a function of the ideal template, the ideal warp parameters, and a given noise or perturbation model. In this paper, we assume a discretization of the warp parameter space and derive a theoretical expression for the probability mass function (PMF) of the parameter estimate. Because the PMF is also a function of the template size, we can optimize the choice of template or block size by selecting the size whose parameter estimate has minimum entropy. Experimental results illustrate the correctness of the theory. An experiment on feature point tracking in face video illustrates the robustness of the algorithm on a real-world problem.
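
As a rough illustration of the quantities involved, the sketch below considers the simplest possible case: a 1-D signal, a translation-only warp discretized to integer positions, and additive Gaussian noise on the target. It does not reproduce the theoretical PMF derived in the paper; it substitutes a Monte Carlo estimate of the PMF, which is an assumption made purely for illustration. All names (`ssd`, `estimate_position`, `empirical_pmf`, `entropy_bits`) and the parameters `sigma` and `trials` are hypothetical.

```python
import math
import random


def ssd(a, b):
    """Sum of squared differences between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def estimate_position(target, template, positions):
    """Return the candidate position whose window of `target` minimizes the SSD."""
    n = len(template)
    return min(positions, key=lambda s: ssd(target[s:s + n], template))


def empirical_pmf(signal, true_pos, size, positions, sigma, trials, rng):
    """Monte Carlo estimate of the PMF of the position estimate.

    The ideal template is cut from the noise-free signal at `true_pos`.
    Each trial adds i.i.d. Gaussian noise (std `sigma`) to the target.
    Every candidate window must fit inside `signal`.
    """
    template = signal[true_pos:true_pos + size]
    counts = {s: 0 for s in positions}
    for _ in range(trials):
        noisy = [v + rng.gauss(0.0, sigma) for v in signal]
        counts[estimate_position(noisy, template, positions)] += 1
    return {s: c / trials for s, c in counts.items()}


def entropy_bits(pmf):
    """Shannon entropy (in bits) of a PMF given as {outcome: probability}."""
    return sum(-p * math.log2(p) for p in pmf.values() if p > 0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical ideal signal: uniform random samples in [0, 1).
    signal = [rng.random() for _ in range(64)]
    positions = range(10, 31)  # discretized warp (translation) space
    for size in (2, 4, 8, 16):
        pmf = empirical_pmf(signal, 20, size, positions, 0.3, 500, rng)
        print(size, round(entropy_bits(pmf), 3))
```

Under these assumptions, the template size to prefer would be the one with the smallest printed entropy, mirroring the minimum-entropy selection criterion described above. In the paper's setting, the empirical PMF would be replaced by the derived theoretical expression.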