Unsupervised 3D pose estimation with geometric self-supervision

Ching Hang Chen, Ambrish Tyagi, Amit Agrawal, Dylan Drover, Rohith Mv, Stefan Stojanov, James M. Rehg

Research output: Chapter in Book/Report/Conference proceedingConference contribution


We present an unsupervised learning approach to re-cover 3D human pose from 2D skeletal joints extracted from a single image. Our method does not require any multi-view image data, 3D skeletons, correspondences between 2D-3D points, or use previously learned 3D priors during training. A lifting network accepts 2D landmarks as inputs and generates a corresponding 3D skeleton estimate. Dur-ing training, the recovered 3D skeleton is reprojected on random camera viewpoints to generate new 'synthetic' 2D poses. By lifting the synthetic 2D poses back to 3D and re-projecting them in the original camera view, we can de-fine self-consistency loss both in 3D and in 2D. The training can thus be self supervised by exploiting the geometric self-consistency of the lift-reproject-lift process. We show that self-consistency alone is not sufficient to generate realistic skeletons, however adding a 2D pose discriminator enables the lifter to output valid 3D poses. Additionally, to learn from 2D poses 'in the wild', we train an unsupervised 2D domain adapter network to allow for an expansion of 2D data. This improves results and demonstrates the useful-ness of 2D pose data for unsupervised 3D lifting. Results on Human3.6M dataset for 3D human pose estimation demon-strate that our approach improves upon the previous un-supervised methods by 30% and outperforms many weakly supervised approaches that explicitly use 3D data.

Original languageEnglish (US)
Title of host publicationProceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
PublisherIEEE Computer Society
Number of pages11
ISBN (Electronic)9781728132938
StatePublished - Jun 2019
Externally publishedYes
Event32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States
Duration: Jun 16 2019Jun 20 2019

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print)1063-6919


Conference32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Country/TerritoryUnited States
CityLong Beach


  • 3D from Single Image
  • And Body Pose
  • Deep Learning
  • Face
  • Gesture

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition


Dive into the research topics of 'Unsupervised 3D pose estimation with geometric self-supervision'. Together they form a unique fingerprint.

Cite this