Abstract
Shotton and co-researchers describe a landmark computer vision system that takes a single depth image containing a person and automatically estimates the person's 3D body pose. This novel method for pose estimation is the key to the Kinect's success. Because pose is estimated independently from each image, this tracking-by-detection approach has the potential for greater robustness, since errors are less likely to accumulate over time. It is enabled by an extremely efficient and reliable solution to the pose estimation problem. Shotton and co-researchers employ data-driven learning to address the tremendous variability in pose and appearance. Motion capture data was used to characterize the space of possible poses: actors performed gestures used in gaming and their joint angles were measured, resulting in a dataset of 100,000 poses. Given a single pose, a simulated depth image can be produced by transferring the pose to a character model and rendering the clothing and hair. The final idea is the use of discriminative part models to represent the body pose.
Original language | English (US) |
---|---|
Pages (from-to) | 115 |
Number of pages | 1 |
Journal | Communications of the ACM |
Volume | 56 |
Issue number | 1 |
DOIs | |
State | Published - Jan 2013 |
Externally published | Yes |
ASJC Scopus subject areas
- General Computer Science