A Coarse-to-fine approach for fast deformable object detection

Problem Domain

We present a method that dramatically accelerates object detection with part-based models. The method rests on the observation that the cost of detection is likely to be dominated by the cost of matching each part to the image, not by the cost of computing the optimal configuration of the parts, as is commonly assumed. Accelerating detection therefore requires minimizing the number of part-to-image comparisons.
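The cost argument can be made concrete with a rough operation count. The Python sketch below is only a back-of-the-envelope illustration: the filter size, feature dimension, and per-location cost of the configuration step are assumed values chosen for the example, not figures taken from [1].

```python
def matching_ops(part_h, part_w, feat_dim):
    """Multiply-adds needed to score one part filter at one image location."""
    return part_h * part_w * feat_dim

# Assumed, illustrative sizes (not from the paper):
PART_H, PART_W = 6, 6   # part filter size, in feature cells
FEAT_DIM = 31           # feature channels per cell
CONFIG_OPS = 10         # assumed constant cost per location of the configuration step

match = matching_ops(PART_H, PART_W, FEAT_DIM)
print("matching ops per location:     ", match)
print("configuration ops per location:", CONFIG_OPS)
print("ratio:                         ", match / CONFIG_OPS)
```

Under these assumptions, matching a part costs on the order of a thousand operations per location, while the configuration step costs a small constant, which is why reducing the number of locations at which parts are matched is the more effective lever.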

To this end, we propose a multi-resolution hierarchical part-based model and a corresponding coarse-to-fine inference procedure that recursively eliminates unpromising part placements from the search space [1]. The method yields a ten-fold speed-up over the standard dynamic programming approach.

Fig. 1: Coarse-to-fine inference.

We propose a method for the fast inference of multi-resolution part-based models. Fig. 1(a): example detections. Fig. 1(b): scores obtained by matching the lowest-resolution part (root filter) at all image locations. Fig. 1(c): scores obtained by matching the intermediate-resolution parts, only at locations selected based on the response of the root part. Fig. 1(d): scores obtained by matching the high-resolution parts, only at locations selected based on the intermediate-resolution scores. White space indicates that the part is not matched at that image location, which saves computation. The saving increases with the resolution.
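The selection scheme in the caption above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of [1]: the function `coarse_to_fine`, the `score(level, y, x)` callback, the rule of keeping the best root placement in each non-overlapping block, and the use of a single part per level are all hypothetical simplifications introduced here.

```python
import random

def coarse_to_fine(score, shape, levels=3, block=2):
    """Toy coarse-to-fine search over a resolution pyramid.

    score(level, y, x) -> float is a hypothetical part-matching function.
    Level 0 is the coarsest grid of size `shape`; each finer level doubles
    the resolution. Returns (best_total, best_location, evals_per_level).
    """
    h, w = shape
    n_evals = [0] * levels

    def match(level, y, x):
        n_evals[level] += 1          # count every part-to-image comparison
        return score(level, y, x)

    # Level 0: match the root part at every location.
    root = [[match(0, y, x) for x in range(w)] for y in range(h)]

    # Assumed selection rule: keep the best root placement per block.
    hypotheses = []
    for by in range(0, h, block):
        for bx in range(0, w, block):
            cells = [(root[y][x], y, x)
                     for y in range(by, min(by + block, h))
                     for x in range(bx, min(bx + block, w))]
            hypotheses.append(max(cells))

    # Finer levels: match only at the 2x2 children of surviving placements.
    for level in range(1, levels):
        refined = []
        for total, y, x in hypotheses:
            children = [(total + match(level, 2 * y + dy, 2 * x + dx),
                         2 * y + dy, 2 * x + dx)
                        for dy in (0, 1) for dx in (0, 1)]
            refined.append(max(children))
        hypotheses = refined

    best_total, y, x = max(hypotheses)
    return best_total, (y, x), n_evals

# Toy usage with random scores on an 8x8 root grid.
random.seed(0)
_table = {}
def toy_score(level, y, x):
    return _table.setdefault((level, y, x), random.random())

best, loc, n_evals = coarse_to_fine(toy_score, shape=(8, 8))
exhaustive = [8 * 8 * 4 ** level for level in range(3)]
print("coarse-to-fine evaluations per level:", n_evals)
print("exhaustive evaluations per level:    ", exhaustive)
```

In this toy setting the number of evaluations stays constant across levels, while exhaustive matching grows fourfold with each doubling of resolution, which mirrors the caption's remark that the saving increases with the resolution.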

Our approach is complementary to the cascade-of-parts approach of [2]. Unlike the latter, our method has no parameters that must be determined empirically, which simplifies its use during model training. Most importantly, the two techniques can be combined to obtain a speed-up of up to two orders of magnitude in some cases.


We evaluate our method extensively on the PASCAL VOC and INRIA datasets, demonstrating a large increase in detection speed with little loss of accuracy.

Detection AP and speed on the VOC 2007 test data.

Code Available


[1] M. Pedersoli, A. Vedaldi, and J. Gonzàlez. "A coarse-to-fine approach for fast deformable object detection". In 24th IEEE Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011.

[2] P. Felzenszwalb, R. Girshick, and D. McAllester. "Cascade object detection with deformable part models". In CVPR, 2010.

[3] A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman. "Multiple kernels for object detection". In ICCV, 2009.

[4] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. "Object detection with discriminatively trained part based models". TPAMI, 32(9), 2010.

[5] L. Zhu, Y. Chen, A. Yuille, and W. Freeman. "Latent hierarchical structural learning for object detection". In CVPR, 2010.