This paper presents a novel representation for dynamic scenes composed of multiple rigid objects that may undergo different motions and be observed by a moving camera. Multi-view constraints associated with groups of affine-invariant scene patches and a normalized description of their appearance are used to segment a scene into its rigid parts, construct three-dimensional projective, affine, and Euclidean models of these parts, and match instances of models recovered from different image sequences. The proposed approach has been implemented and applied to the detection and recognition of moving objects in video sequences and to the identification of shots that depict the same scene in a video clip (shot matching).
|Original language|English (US)|
|Journal|Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition|
|State|Published - 2004|
|Event|Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2004 - Washington, DC, United States (Duration: Jun 27 2004 → Jul 2 2004)|
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition