Human pose regression through multiview visual fusion

Xu Zhao, Yun Fu, Huazhong Ning, Yuncai Liu, Thomas S. Huang

Research output: Contribution to journal › Article › peer-review


We consider the problem of estimating 3-D human body pose from visual signals within a discriminative framework. It is challenging because there is a wide gap between complex 3-D human motion and planar visual observation, which makes this a severely ill-conditioned problem. In this paper, we focus on three critical factors to tackle human body pose estimation, namely, feature extraction, learning algorithm, and camera utilization. On the feature level, we describe images using the salient interest points represented by scale-invariant feature transform (SIFT)-like descriptors, in which the position, appearance, and local structural information are encoded simultaneously. On the learning algorithm level, we propose to use Gaussian processes and multiple linear (ML) regression to model the mapping between poses and features. Fusing image information from multiple cameras in different views is of great interest to us on the camera level. We make a comprehensive evaluation on the HumanEva database and get two meaningful insights into the three crucial aspects for human pose estimation: 1) although the choice of feature is very important to the problem, once the learning algorithm becomes efficient, the choice of feature is no longer critical, and 2) the impact of information combination from multiple cameras on pose estimation is closely related to not only the quantity of image information, but also its quality. In most cases, it is true that the more information is involved, the better results can be achieved. But when the information quantity is the same, the differences in quality will lead to totally different performance. Furthermore, dense evaluations demonstrate that our approach is an accurate and robust solution to the human body pose estimation problem.
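The learning and camera levels described in the abstract can be illustrated with a minimal sketch: per-view image descriptors are fused and then mapped to pose vectors with Gaussian process regression. This is not the authors' implementation. The fusion by simple concatenation, the RBF kernel, the hyperparameter values, and all function names (`fuse_views`, `rbf_kernel`, `gp_fit`, `gp_predict`) are assumptions made for illustration, and the data below are random placeholders rather than HumanEva features.

```python
import numpy as np


def fuse_views(view_features):
    """Hypothetical fusion: concatenate per-view descriptors along the feature axis."""
    return np.concatenate(view_features, axis=1)


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between every row of a and every row of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_fit(x_train, y_train, length_scale=1.0, noise=1e-2):
    """Solve (K + noise * I) alpha = Y via a Cholesky factorization."""
    k = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    chol = np.linalg.cholesky(k)
    return np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))


def gp_predict(x_train, alpha, x_test, length_scale=1.0):
    """Posterior mean of the GP at x_test (one row per predicted pose)."""
    return rbf_kernel(x_test, x_train, length_scale) @ alpha


# Placeholder data: 3 camera views, 20 frames, 4-D descriptor per view,
# and a 6-D pose vector per frame (all sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
views = [rng.normal(size=(20, 4)) for _ in range(3)]
features = fuse_views(views)                 # shape (20, 12)
poses = features @ rng.normal(size=(12, 6))  # synthetic targets, shape (20, 6)

alpha = gp_fit(features, poses, noise=1e-8)
predicted = gp_predict(features, alpha, features)
```

Dropping a view from the list passed to `fuse_views` gives a simple way to compare predictions with less image information, in the spirit of the quantity-versus-quality comparison the abstract reports.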

Original language: English (US)
Article number: 5433014
Pages (from-to): 957-966
Number of pages: 10
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Issue number: 7
State: Published - Jul 2010

Keywords


  • Gaussian processes regression
  • Human pose estimation
  • Image feature
  • Multiple views

ASJC Scopus subject areas

  • Media Technology
  • Electrical and Electronic Engineering

